WorldWideScience

Sample records for complex bioinformatics web

  1. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  2. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.

    Science.gov (United States)

    Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel

    2015-06-02

    Bioinformaticians face a range of difficulties to get locally-installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS intermediates the access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running in simple computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requisitions, and automatically creates a web page that disposes the registered applications and clients. Bioinformatics open web services registered applications can be accessed from virtually any programming language through web services, or using standard java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run high-processing demand applications directly from their machines.

  3. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  4. An active registry for bioinformatics web services.

    NARCIS (Netherlands)

    Pettifer, S.; Thorne, D.; McDermott, P.; Attwood, T.; Baran, J.; Bryne, J.C.; Hupponen, T.; Mowbray, D.; Vriend, G.

    2009-01-01

    SUMMARY: The EMBRACE Registry is a web portal that collects and monitors web services according to test scripts provided by the their administrators. Users are able to search for, rank and annotate services, enabling them to select the most appropriate working service for inclusion in their

  5. KBWS: an EMBOSS associated package for accessing bioinformatics web services.

    Science.gov (United States)

    Oshita, Kazuki; Arakawa, Kazuharu; Tomita, Masaru

    2011-04-29

    The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  6. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS, adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded and http://soap.g-language.org/kbws_dl.wsdl (Document/literal.

  7. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    Full Text Available Abstract Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1 the platforms on which the applications run are heterogeneous, 2 their web interface is not machine-friendly, 3 they use a non-standard format for data input and output, 4 they do not exploit standards to define application interface and message exchange, and 5 existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates

  8. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore,

  9. jORCA: easily integrating bioinformatics Web Services.

    Science.gov (United States)

    Martín-Requena, Victoria; Ríos, Javier; García, Maximiliano; Ramírez, Sergio; Trelles, Oswaldo

    2010-02-15

    Web services technology is becoming the option of choice to deploy bioinformatics tools that are universally available. One of the major strengths of this approach is that it supports machine-to-machine interoperability over a network. However, a weakness of this approach is that various Web Services differ in their definition and invocation protocols, as well as their communication and data formats-and this presents a barrier to service interoperability. jORCA is a desktop client aimed at facilitating seamless integration of Web Services. It does so by making a uniform representation of the different web resources, supporting scalable service discovery, and automatic composition of workflows. Usability is at the top of the jORCA agenda; thus it is a highly customizable and extensible application that accommodates a broad range of user skills featuring double-click invocation of services in conjunction with advanced execution-control, on the fly data standardization, extensibility of viewer plug-ins, drag-and-drop editing capabilities, plus a file-based browsing style and organization of favourite tools. The integration of bioinformatics Web Services is made easier to support a wider range of users. .

  10. MAPI: towards the integrated exploitation of bioinformatics Web Services.

    Science.gov (United States)

    Ramirez, Sergio; Karlsson, Johan; Trelles, Oswaldo

    2011-10-27

    Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).

  11. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  12. WeBIAS: a web server for publishing bioinformatics applications.

    Science.gov (United States)

    Daniluk, Paweł; Wilczyński, Bartek; Lesyng, Bogdan

    2015-11-02

    One of the requirements for a successful scientific tool is its availability. Developing a functional web service, however, is usually considered a mundane and ungratifying task, and quite often neglected. When publishing bioinformatic applications, such attitude puts additional burden on the reviewers who have to cope with poorly designed interfaces in order to assess quality of presented methods, as well as impairs actual usefulness to the scientific community at large. In this note we present WeBIAS-a simple, self-contained solution to make command-line programs accessible through web forms. It comprises a web portal capable of serving several applications and backend schedulers which carry out computations. The server handles user registration and authentication, stores queries and results, and provides a convenient administrator interface. WeBIAS is implemented in Python and available under GNU Affero General Public License. It has been developed and tested on GNU/Linux compatible platforms covering a vast majority of operational WWW servers. Since it is written in pure Python, it should be easy to deploy also on all other platforms supporting Python (e.g. Windows, Mac OS X). Documentation and source code, as well as a demonstration site are available at http://bioinfo.imdik.pan.pl/webias . WeBIAS has been designed specifically with ease of installation and deployment of services in mind. Setting up a simple application requires minimal effort, yet it is possible to create visually appealing, feature-rich interfaces for query submission and presentation of results.

  13. The web server of IBM's Bioinformatics and Pattern Discovery group.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  14. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  15. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  16. BioSWR--semantic web services registry for bioinformatics.

    Directory of Open Access Journals (Sweden)

    Dmitry Repchevsky

    Full Text Available Despite of the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones are more adherent to standards and usually rely on Web Service Definition Language (WSDL. Although WSDL is quite flexible to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL. BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF based Web services descriptions along with the traditional WSDL based ones. The registry provides Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via Representational State Transfer (REST API or using a SPARQL Protocol and RDF Query Language. BioSWR server is located at http://inb.bsc.es/BioSWR/and its code is available at https://sourceforge.net/projects/bioswr/under the LGPL license.

  17. BioSWR – Semantic Web Services Registry for Bioinformatics

    Science.gov (United States)

    Repchevsky, Dmitry; Gelpi, Josep Ll.

    2014-01-01

    Despite of the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones are more adherent to standards and usually rely on Web Service Definition Language (WSDL). Although WSDL is quite flexible to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web services descriptions along with the traditional WSDL based ones. The registry provides Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via Representational State Transfer (REST) API or using a SPARQL Protocol and RDF Query Language. BioSWR server is located at http://inb.bsc.es/BioSWR/and its code is available at https://sourceforge.net/projects/bioswr/under the LGPL license. PMID:25233118

  18. BioSWR--semantic web services registry for bioinformatics.

    Science.gov (United States)

    Repchevsky, Dmitry; Gelpi, Josep Ll

    2014-01-01

    Despite of the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones are more adherent to standards and usually rely on Web Service Definition Language (WSDL). Although WSDL is quite flexible to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web services descriptions along with the traditional WSDL based ones. The registry provides Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via Representational State Transfer (REST) API or using a SPARQL Protocol and RDF Query Language. BioSWR server is located at http://inb.bsc.es/BioSWR/and its code is available at https://sourceforge.net/projects/bioswr/under the LGPL license.

  19. A semantic web approach applied to integrative bioinformatics experimentation: a biological use case with genomics data.

    NARCIS (Netherlands)

    Post, L.J.G.; Roos, M.; Marshall, M.S.; van Driel, R.; Breit, T.M.

    2007-01-01

    The numerous public data resources make integrative bioinformatics experimentation increasingly important in life sciences research. However, it is severely hampered by the way the data and information are made available. The semantic web approach enhances data exchange and integration by providing

  20. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However,

  1. The EMBL-EBI bioinformatics web and programmatic tools framework.

    Science.gov (United States)

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-07-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of depreciated services, ensure that the framework remains relevant to today's biological community. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. ClusterControl: a web interface for distributing and monitoring bioinformatics applications on a Linux cluster.

    Science.gov (United States)

    Stocker, Gernot; Rieder, Dietmar; Trajanoski, Zlatko

    2004-03-22

    ClusterControl is a web interface to simplify distributing and monitoring bioinformatics applications on Linux cluster systems. We have developed a modular concept that enables integration of command line oriented program into the application framework of ClusterControl. The systems facilitate integration of different applications accessed through one interface and executed on a distributed cluster system. The package is based on freely available technologies like Apache as web server, PHP as server-side scripting language and OpenPBS as queuing system and is available free of charge for academic and non-profit institutions. http://genome.tugraz.at/Software/ClusterControl

  3. BioXSD: the common data-exchange format for everyday bioinformatics web services

    Science.gov (United States)

    Kalaš, Matúš; Puntervoll, Pæl; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org PMID:20823319

  4. BioXSD: the common data-exchange format for everyday bioinformatics web services.

    Science.gov (United States)

    Kalas, Matús; Puntervoll, Pål; Joseph, Alexandre; Bartaseviciūte, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-09-15

    The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.

  5. GeneDig: a web application for accessing genomic and bioinformatics knowledge.

    Science.gov (United States)

    Suciu, Radu M; Aydin, Emir; Chen, Brian E

    2015-02-28

    With the exponential increase and widespread availability of genomic, transcriptomic, and proteomic data, accessing these '-omics' data is becoming increasingly difficult. The current resources for accessing and analyzing these data have been created to perform highly specific functions intended for specialists, and thus typically emphasize functionality over user experience. We have developed a web-based application, GeneDig.org, that allows any general user access to genomic information with ease and efficiency. GeneDig allows for searching and browsing genes and genomes, while a dynamic navigator displays genomic, RNA, and protein information simultaneously for co-navigation. We demonstrate that our application allows more than five times faster and efficient access to genomic information than any currently available methods. We have developed GeneDig as a platform for bioinformatics integration focused on usability as its central design. This platform will introduce genomic navigation to broader audiences while aiding the bioinformatics analyses performed in everyday biology research.

  6. BioXSD: the common data-exchange format for everyday bioinformatics web services

    DEFF Research Database (Denmark)

    Kalas, M.; Puntervoll, P.; Joseph, A.

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use...... and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth...... interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web....

  7. The World-Wide Web: An Interface between Research and Teaching in Bioinformatics

    Directory of Open Access Journals (Sweden)

    James F. Aiton

    1994-01-01

    Full Text Available The rapid expansion occurring in World-Wide Web activity is beginning to make the concepts of ‘global hypermedia’ and ‘universal document readership’ realistic objectives of the new revolution in information technology. One consequence of this increase in usage is that educators and students are becoming more aware of the diversity of the knowledge base which can be accessed via the Internet. Although computerised databases and information services have long played a key role in bioinformatics these same resources can also be used to provide core materials for teaching and learning. The large datasets and arch ives th at have been compiled for biomedical research can be enhanced with the addition of a variety of multimedia elements (images. digital videos. animation etc.. The use of this digitally stored information in structured and self-directed learning environments is likely to increase as activity across World-Wide Web increases.

  8. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Directory of Open Access Journals (Sweden)

    Willighagen Egon L

    2009-09-01

    Full Text Available Abstract Background Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP and REpresentational State Transfer (REST services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP, consisting of an extension (IO Data to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1 services are discoverable without the need of an external registry, 2 asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3 input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  9. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services.

    Science.gov (United States)

    Wagener, Johannes; Spjuth, Ola; Willighagen, Egon L; Wikberg, Jarl E S

    2009-09-04

    Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  10. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser.

    Science.gov (United States)

    Almeida, Jonas S; Iriabho, Egiebade E; Gorrepati, Vijaya L; Wilkinson, Sean R; Grüneberg, Alexander; Robbins, David E; Hackney, James R

    2012-01-01

    Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH's popular ImageJ application. The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures machines where access to sensitive data is controlled, that is, without local "download and installation".

  11. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser

    Directory of Open Access Journals (Sweden)

    Jonas S Almeida

    2012-01-01

    Full Text Available Background: Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. Materials and Methods: ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Results : Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH′s popular ImageJ application. Conclusions : The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures machines where access to sensitive data is controlled, that is, without

  12. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Science.gov (United States)

    Wagener, Johannes; Spjuth, Ola; Willighagen, Egon L; Wikberg, Jarl ES

    2009-01-01

    Background Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics. PMID:19732427

  13. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment. High-throughput sequencing (HTS revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  14. Ergatis: a web interface and scalable software system for bioinformatics workflows

    Science.gov (United States)

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  15. JABAWS 2.2 distributed web services for Bioinformatics: protein disorder, conservation and RNA secondary structure.

    Science.gov (United States)

    Troshin, Peter V; Procter, James B; Sherstnev, Alexander; Barton, Daniel L; Madeira, Fábio; Barton, Geoffrey J

    2018-06-01

    JABAWS 2.2 is a computational framework that simplifies the deployment of web services for Bioinformatics. In addition to the five multiple sequence alignment (MSA) algorithms in JABAWS 1.0, JABAWS 2.2 includes three additional MSA programs (Clustal Omega, MSAprobs, GLprobs), four protein disorder prediction methods (DisEMBL, IUPred, Ronn, GlobPlot), 18 measures of protein conservation as implemented in AACon, and RNA secondary structure prediction by the RNAalifold program. JABAWS 2.2 can be deployed on a variety of in-house or hosted systems. JABAWS 2.2 web services may be accessed from the Jalview multiple sequence analysis workbench (Version 2.8 and later), as well as directly via the JABAWS command line interface (CLI) client. JABAWS 2.2 can be deployed on a local virtual server as a Virtual Appliance (VA) or simply as a Web Application Archive (WAR) for private use. Improvements in JABAWS 2.2 also include simplified installation and a range of utility tools for usage statistics collection, and web services querying and monitoring. The JABAWS CLI client has been updated to support all the new services and allow integration of JABAWS 2.2 services into conventional scripts. A public JABAWS 2 server has been in production since December 2011 and served over 800 000 analyses for users worldwide. JABAWS 2.2 is made freely available under the Apache 2 license and can be obtained from: http://www.compbio.dundee.ac.uk/jabaws. g.j.barton@dundee.ac.uk.

  16. The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

    Directory of Open Access Journals (Sweden)

    Katayama Toshiaki

    2011-08-01

    Full Text Available Abstract Background The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i a workflow to annotate 100,000 sequences from an invertebrate species; ii an integrated system for analysis of the transcription factor binding sites (TFBSs enriched based on differential gene expression data obtained from a microarray experiment; iii a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i the absence of several useful data or analysis functions in the Web service "space"; ii the lack of documentation of methods; iii lack of

  17. The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

    Science.gov (United States)

    2011-01-01

    Background The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP

  18. The web server of IBM's Bioinformatics and Pattern Discovery group: 2004 update.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore

    2004-07-01

    In this report, we provide an update on the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server, which is operational around the clock, provides access to a large number of methods that have been developed and published by the group's members. There is an increasing number of problems that these tools can help tackle; these problems range from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences, the identification--directly from sequence--of structural deviations from alpha-helicity and the annotation of amino acid sequences for antimicrobial activity. Additionally, annotations for more than 130 archaeal, bacterial, eukaryotic and viral genomes are now available on-line and can be searched interactively. The tools and code bundles continue to be accessible from http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  19. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

    Science.gov (United States)

    Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J

    2017-09-20

    There are a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities to be used in Vector Space Model and Cosine Similarity Measures. According to our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detecting of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data source. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly

  20. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    OpenAIRE

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; Oven, Mannis; Wallace, D.C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J.; Gai, Xiaowu

    2016-01-01

    textabstractMSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR ...

  1. Implementing a Web-Based Introductory Bioinformatics Course for Non-Bioinformaticians That Incorporates Practical Exercises

    Science.gov (United States)

    Vincent, Antony T.; Bourbonnais, Yves; Brouard, Jean-Simon; Deveau, Hélène; Droit, Arnaud; Gagné, Stéphane M.; Guertin, Michel; Lemieux, Claude; Rathier, Louis; Charette, Steve J.; Lagüe, Patrick

    2018-01-01

    A recent scientific discipline, bioinformatics, defined as using informatics for the study of biological problems, is now a requirement for the study of biological sciences. Bioinformatics has become such a powerful and popular discipline that several academic institutions have created programs in this field, allowing students to become…

  2. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  3. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  4. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*

    Directory of Open Access Journals (Sweden)

    Katayama Toshiaki

    2010-08-01

    Full Text Available Abstract Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS and Computational Biology Research Center (CBRC and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues arisen from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields, however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

  5. How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications.

    Science.gov (United States)

    Khomtchouk, Bohdan B; Weitz, Edmund; Karp, Peter D; Wahlestedt, Claes

    2016-12-31

    We present a rationale for expanding the presence of the Lisp family of programming languages in bioinformatics and computational biology research. Put simply, Lisp-family languages enable programmers to more quickly write programs that run faster than in other languages. Languages such as Common Lisp, Scheme and Clojure facilitate the creation of powerful and flexible software that is required for complex and rapidly evolving domains like biology. We will point out several important key features that distinguish languages of the Lisp family from other programming languages, and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSLs): languages that are specialized to a particular area, and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the 'programmable programming language'. We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and artificial intelligence research in bioinformatics and computational biology. © The Author 2016. Published by Oxford University Press.

  6. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio Database

    Directory of Open Access Journals (Sweden)

    Jeongseok Choi

    2016-03-01

    Full Text Available Internet addiction (IA has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  7. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  8. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    Science.gov (United States)

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. © 2016 WILEY PERIODICALS, INC.

  9. DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function.

    Science.gov (United States)

    Cheng, Liang; Hu, Yang; Sun, Jie; Zhou, Meng; Jiang, Qinghua

    2018-06-01

    DincRNA aims to provide a comprehensive web-based bioinformatics toolkit to elucidate the entangled relationships among diseases and non-coding RNAs (ncRNAs) from the perspective of disease similarity. The quantitative way to illustrate relationships of pair-wise diseases always depends on their molecular mechanisms, and structures of the directed acyclic graph of Disease Ontology (DO). Corresponding methods for calculating similarity of pair-wise diseases involve Resnik's, Lin's, Wang's, PSB and SemFunSim methods. Recently, disease similarity was validated suitable for calculating functional similarities of ncRNAs and prioritizing ncRNA-disease pairs, and it has been widely applied for predicting the ncRNA function due to the limited biological knowledge from wet lab experiments of these RNAs. For this purpose, a large number of algorithms and priori knowledge need to be integrated. e.g. 'pair-wise best, pairs-average' (PBPA) and 'pair-wise all, pairs-maximum' (PAPM) methods for calculating functional similarities of ncRNAs, and random walk with restart (RWR) method for prioritizing ncRNA-disease pairs. To facilitate the exploration of disease associations and ncRNA function, DincRNA implemented all of the above eight algorithms based on DO and disease-related genes. Currently, it provides the function to query disease similarity scores, miRNA and lncRNA functional similarity scores, and the prioritization scores of lncRNA-disease and miRNA-disease pairs. http://bio-annotation.cn:18080/DincRNAClient/. biofomeng@hotmail.com or qhjiang@hit.edu.cn. Supplementary data are available at Bioinformatics online.

  10. Aspects of Data Warehouse Technologies for Complex Web Data

    OpenAIRE

    Thomsen, Christian

    2008-01-01

    This thesis is about aspects of specification and development of datawarehouse technologies for complex web data. Today, large amounts of dataexist in different web resources and in different formats. But it is oftenhard to analyze and query the often big and complex data or data about thedata (i.e., metadata). It is therefore interesting to apply Data Warehouse(DW) technology to the data. But to apply DW technology to complex web datais not straightforward and the DW community faces new and ...

  11. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    NARCIS (Netherlands)

    L. Shen (Lishuang); M.A. Diroma (Maria Angela); M. Gonzalez (Michael); D. Navarro-Gomez (Daniel); J. Leipzig (Jeremy); M.T. Lott (Marie T.); M. van Oven (Mannis); D.C. Wallace; C.C. Muraresku (Colleen Clarke); Z. Zolkipli-Cunningham (Zarazuela); P.F. Chinnery (Patrick); M. Attimonelli (Marcella); S. Zuchner (Stephan); M.J. Falk (Marni J.); X. Gai (Xiaowu)

    2016-01-01

    textabstractMSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes,

  12. Forecasting of interaction between bee propolis and protective antigenic domain in anthrax using the software and bioinformatics web servers

    Directory of Open Access Journals (Sweden)

    Elmira Mohammadi

    2017-01-01

    Full Text Available Background: Protective antigen of anthrax toxin, after touching the cell receptors, plays an important role in the pathogenesis of toxin. The purpose of this study was to investigate the interaction of anthrax toxin protective antigen and four great combination propolis included caffeic acid, benzyl caffeate, cinnamic acid and kaempferol using the softwares and bioinformatics web servers. Methods: Three-dimensional structure of protective antigen (receptor obtains from Protein Data Bank (PDB. Four of the main components from propolis were selected          as ligand and their 3D-structures were obtained from ChemSpider and ZINC     compound database. The interaction of each ligand and receptor was assessed                   by SwissDock server (http://www.swissdock.ch/ and BSP-SLIM server (http://zhanglab.ccmb.med.umich.edu/BSP-SLIM. Docking results appears with Fullfitness numbers (in kcal/mol. Identification of amino acids involved in ligand and receptor interaction, was performed using the Chimera software; UCSF Chimera program (http://www.cgl.ucsf.edu/. Results: The results of interaction between propolis components and protective antigen by BSP-SLIM server showed that the most interaction was related with benzyl caffeate, caffeic acid, kaempferol and cinnamic acid, respectively. Results for the desired ligand Interaction with protective antigen genes using SwissDock server showed that the caffeic acid had ΔG equals -9.10 kcal/mol and FullFitness equal to -993.16 kcal/mol respectively. The analysis of interaction between ligands with amino-acids of protective antigen indicated that the interaction of Caffeic acid whit Glutamic acid 117 had energy -15.5429 kcal/mol. Conclusion: Finding strong and safe inhibitors for anthrax toxin is very useful method for inhibiting its toxicity to cell. In this study the binding ability of four flavonoids to protective antigen was studied. Glutamic acid 117 is very effective

  13. Food web complexity and stability across habitat connectivity gradients.

    Science.gov (United States)

    LeCraw, Robin M; Kratina, Pavel; Srivastava, Diane S

    2014-12-01

    The effects of habitat connectivity on food webs have been studied both empirically and theoretically, yet the question of whether empirical results support theoretical predictions for any food web metric other than species richness has received little attention. Our synthesis brings together theory and empirical evidence for how habitat connectivity affects both food web stability and complexity. Food web stability is often predicted to be greatest at intermediate levels of connectivity, representing a compromise between the stabilizing effects of dispersal via rescue effects and prey switching, and the destabilizing effects of dispersal via regional synchronization of population dynamics. Empirical studies of food web stability generally support both this pattern and underlying mechanisms. Food chain length has been predicted to have both increasing and unimodal relationships with connectivity as a result of predators being constrained by the patch occupancy of their prey. Although both patterns have been documented empirically, the underlying mechanisms may differ from those predicted by models. In terms of other measures of food web complexity, habitat connectivity has been empirically found to generally increase link density but either reduce or have no effect on connectance, whereas a unimodal relationship is expected. In general, there is growing concordance between empirical patterns and theoretical predictions for some effects of habitat connectivity on food webs, but many predictions remain to be tested over a full connectivity gradient, and empirical metrics of complexity are rarely modeled. Closing these gaps will allow a deeper understanding of how natural and anthropogenic changes in connectivity can affect real food webs.

  14. Complexity aspects of web services composition

    OpenAIRE

    Ennaoui, Karima; Nourine, Lhouari; Toumani, Farouk

    2015-01-01

    The web service composition problem can be stated as follows: given a finite state machine M , representing a service business protocol, and a set of finite state machines R, representing the business protocols of existing services, the question is to check whether there is a simulation relation between M and the shuffle product closure of R. In fact the shuffle product is a subclass of the communication free petri net and basic parallel processes, for which the same problem of simulation is ...

  15. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform, Version 1.5 and 1.x.

    Energy Technology Data Exchange (ETDEWEB)

    2017-05-18

    EDGE bioinformatics was developed to help biologists process Next Generation Sequencing data (in the form of raw FASTQ files), even if they have little to no bioinformatics expertise. EDGE is a highly integrated and interactive web-based platform that is capable of running many of the standard analyses that biologists require for viral, bacterial/archaeal, and metagenomic samples. EDGE provides the following analytical workflows: quality trimming and host removal, assembly and annotation, comparisons against known references, taxonomy classification of reads and contigs, whole genome SNP-based phylogenetic analysis, and PCR analysis. EDGE provides an intuitive web-based interface for user input, allows users to visualize and interact with selected results (e.g. JBrowse genome browser), and generates a final detailed PDF report. Results in the form of tables, text files, graphic files, and PDFs can be downloaded. A user management system allows tracking of an individual’s EDGE runs, along with the ability to share, post publicly, delete, or archive their results.

  16. pocketZebra: a web-server for automated selection and classification of subfamily-specific binding sites by bioinformatic analysis of diverse protein families.

    Science.gov (United States)

    Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Svedas, Vytas

    2014-07-01

    The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure-function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Aspects of Data Warehouse Technologies for Complex Web Data

    DEFF Research Database (Denmark)

    Thomsen, Christian

    This thesis is about aspects of specification and development of data warehouse technologies for complex web data. Today, large amounts of data exist in different web resources and in different formats. But it is often hard to analyze and query the often big and complex data or data about the data...... (i.e., metadata). It is therefore interesting to apply Data Warehouse (DW) technology to the data. But to apply DW technology to complex web data is not straightforward and the DW community faces new and exciting challenges. This thesis considers some of these challenges. The work leading...... to this thesis has primarily been done in relation to the project European Internet Accessibility Observatory (EIAO) where a data warehouse for accessibility data (roughly data about how usable web resources are for disabled users) has been specified and developed. But the results of the thesis can also...

  18. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  19. Web life: ComplexityBlog.com

    Science.gov (United States)

    2010-02-01

    The site's homepage calls it "a repository of ideas and perspectives regarding the science, engineering and philosophy of complexity", and it pretty much does what it says on the tin. Part blog, part links archive, part library of modelling tips and tricks, the site is chock full of information that comes under the general heading of "complexity".

  20. Towards Plant Species Identification in Complex Samples: A Bioinformatics Pipeline for the Identification of Novel Nuclear Barcode Candidates.

    Directory of Open Access Journals (Sweden)

    Alexandre Angers-Loustau

    Full Text Available Monitoring of the food chain to fight fraud and protect consumer health relies on the availability of methods to correctly identify the species present in samples, for which DNA barcoding is a promising candidate. The nuclear genome is a rich potential source of barcode targets, but has been relatively unexploited until now. Here, we show the development and use of a bioinformatics pipeline that processes available genome sequences to automatically screen large numbers of input candidates, identifies novel nuclear barcode targets and designs associated primer pairs, according to a specific set of requirements. We applied this pipeline to identify novel barcodes for plant species, a kingdom for which the currently available solutions are known to be insufficient. We tested one of the identified primer pairs and show its capability to correctly identify the plant species in simple and complex samples, validating the output of our approach.

  1. Analyzing HT-SELEX data with the Galaxy Project tools--A web based bioinformatics platform for biomedical research.

    Science.gov (United States)

    Thiel, William H; Giangrande, Paloma H

    2016-03-15

    The development of DNA and RNA aptamers for research as well as diagnostic and therapeutic applications is a rapidly growing field. In the past decade, the process of identifying aptamers has been revolutionized with the advent of high-throughput sequencing (HTS). However, bioinformatics tools that enable the average molecular biologist to analyze these large datasets and expedite the identification of candidate aptamer sequences have been lagging behind the HTS revolution. The Galaxy Project was developed in order to efficiently analyze genome, exome, and transcriptome HTS data, and we have now applied these tools to aptamer HTS data. The Galaxy Project's public webserver is an open source collection of bioinformatics tools that are powerful, flexible, dynamic, and user friendly. The online nature of the Galaxy webserver and its graphical interface allow users to analyze HTS data without compiling code or installing multiple programs. Herein we describe how tools within the Galaxy webserver can be adapted to pre-process, compile, filter and analyze aptamer HTS data from multiple rounds of selection. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP for bioinformatics resource discovery and disparate data and service integration

    Directory of Open Access Journals (Sweden)

    Nelson Rex T

    2010-06-01

    Full Text Available Abstract Background Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap" offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS] used SSWAP to semantically describe selected data and web services. Methods We selected high-priority Quantitative Trait Locus (QTL, genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL, Resource Description Framework (RDF and eXtensible Markup Language (XML documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. Results A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services, two of which allow the retrieval of DNA and RNA FASTA sequences with the third service providing nucleic acid sequence comparison capability (BLAST. Conclusions The need for semantic integration technologies has preceded

  3. Experience of web-complex development of NPP thermophysical optimization

    International Nuclear Information System (INIS)

    Nikolaev, M.A.

    2014-01-01

    Current state of developing computation web complex (CWC) of thermophysical optimization of nuclear power plants is described. Main databases of CWC is realized on the MySQL platform. CWC information architecture, its functionality, optimization algorithms and CWC user interface are under consideration [ru

  4. Caught in the food web: complexity made simple?

    Directory of Open Access Journals (Sweden)

    Lawrence R. Pomeroy

    2001-12-01

    Full Text Available Several historically separate lines of food-web research are merging into a unified approach. Connections between microbial and metazoan food webs are significant. Interactions of control by predators, defenses against predation, and availability of organic and inorganic nutrition, not any one of these, shape food webs. The same principles of population ecology apply to metazoans and microorganisms, but microorganisms dominate the flux of energy in both marine and terrestrial systems. Microbial biomass often is a major fraction of total biomass, and very small organisms have a very large ratio of production and respiration to biomass. Assimilation efficiency of bacteria in natural systems is often not as high as in experimental systems, so more primary production is lost to microbial respiration than had been thought. Simulation has been a highly useful adjunct to experiments in both population theory and in studies of biogeochemical mass balance, but it does not fully encompass the complexity of real systems. A major challenge for the future is to find better ways to deal with the real complexity of food webs, both in modeling and in empirical observations, and to do a better job of bringing together conceptually the dynamics of population processes and biogeochemistry.

  5. The effect of query complexity on Web searching results

    Directory of Open Access Journals (Sweden)

    B.J. Jansen

    2000-01-01

    Full Text Available This paper presents findings from a study of the effects of query structure on retrieval by Web search services. Fifteen queries were selected from the transaction log of a major Web search service in simple query form with no advanced operators (e.g., Boolean operators, phrase operators, etc. and submitted to 5 major search engines - Alta Vista, Excite, FAST Search, Infoseek, and Northern Light. The results from these queries became the baseline data. The original 15 queries were then modified using the various search operators supported by each of the 5 search engines for a total of 210 queries. Each of these 210 queries was also submitted to the applicable search service. The results obtained were then compared to the baseline results. A total of 2,768 search results were returned by the set of all queries. In general, increasing the complexity of the queries had little effect on the results with a greater than 70% overlap in results, on average. Implications for the design of Web search services and directions for future research are discussed.

  6. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  7. Applications and Methods Utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for Bioinformatics Resource Discovery and Disparate Data and Service Integration

    Science.gov (United States)

    Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...

  8. Wikipedias: Collaborative web-based encyclopedias as complex networks

    Science.gov (United States)

    Zlatić, V.; Božičević, M.; Štefančić, H.; Domazet, M.

    2006-07-01

    Wikipedia is a popular web-based encyclopedia edited freely and collaboratively by its users. In this paper we present an analysis of Wikipedias in several languages as complex networks. The hyperlinks pointing from one Wikipedia article to another are treated as directed links while the articles represent the nodes of the network. We show that many network characteristics are common to different language versions of Wikipedia, such as their degree distributions, growth, topology, reciprocity, clustering, assortativity, path lengths, and triad significance profiles. These regularities, found in the ensemble of Wikipedias in different languages and of different sizes, point to the existence of a unique growth process. We also compare Wikipedias to other previously studied networks.

  9. Functional Complexity and Web Site Design : Evaluating the Online Presence of UNESCO World Heritage Sites

    NARCIS (Netherlands)

    De Jong, Menno D.T.; Wu, Yuguang

    2018-01-01

    Functional complexity is a widespread and underresearched phenomenon in Web sites. This article explores a specific case of functional complexity by analyzing the content of UNESCO World Heritage Web sites, which have to meet demands from both World Heritage and tourism perspectives. Based on a

  10. Web-based bioinformatics workflows for end-to-end RNA-seq data computation and analysis in agricultural animal species

    Science.gov (United States)

    Remarkable advances in next-generation sequencing (NGS) technologies, bioinformatics algorithms, and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis still remains as a major bottleneck. RNA-seq, as one of the major area in the NGS fi...

  11. MAJIQ-SPEL: Web-tool to interrogate classical and complex splicing variations from RNA-Seq data.

    Science.gov (United States)

    Green, Christopher J; Gazzara, Matthew R; Barash, Yoseph

    2017-09-11

    Analysis of RNA sequencing (RNA-Seq) data have highlighted the fact that most genes undergo alternative splicing (AS) and that these patterns are tightly regulated. Many of these events are complex, resulting in numerous possible isoforms that quickly become difficult to visualize, interpret, and experimentally validate. To address these challenges we developed MAJIQ-SPEL, a web-tool that takes as input local splicing variations (LSVs) quantified from RNA-Seq data and provides users with visualization and quantification of gene isoforms associated with those. Importantly, MAJIQ-SPEL is able to handle both classical (binary) and complex, non-binary, splicing variations. Using a matching primer design algorithm it also suggests users possible primers for experimental validation by RT-PCR and displays those, along with the matching protein domains affected by the LSV, on UCSC Genome Browser for further downstream analysis. Program and code will be available at http://majiq.biociphers.org/majiq-spel. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  12. Temporal Patterns in Sheep Fetal Heart Rate Variability Correlate to Systemic Cytokine Inflammatory Response: A Methodological Exploration of Monitoring Potential Using Complex Signals Bioinformatics.

    Directory of Open Access Journals (Sweden)

    Christophe L Herry

    Full Text Available Fetal inflammation is associated with increased risk for postnatal organ injuries. No means of early detection exist. We hypothesized that systemic fetal inflammation leads to distinct alterations of fetal heart rate variability (fHRV. We tested this hypothesis deploying a novel series of approaches from complex signals bioinformatics. In chronically instrumented near-term fetal sheep, we induced an inflammatory response with lipopolysaccharide (LPS injected intravenously (n = 10 observing it over 54 hours; seven additional fetuses served as controls. Fifty-one fHRV measures were determined continuously every 5 minutes using Continuous Individualized Multi-organ Variability Analysis (CIMVA. CIMVA creates an fHRV measures matrix across five signal-analytical domains, thus describing complementary properties of fHRV. We implemented, validated and tested methodology to obtain a subset of CIMVA fHRV measures that matched best the temporal profile of the inflammatory cytokine IL-6. In the LPS group, IL-6 peaked at 3 hours. For the LPS, but not control group, a sharp increase in standardized difference in variability with respect to baseline levels was observed between 3 h and 6 h abating to baseline levels, thus tracking closely the IL-6 inflammatory profile. We derived fHRV inflammatory index (FII consisting of 15 fHRV measures reflecting the fetal inflammatory response with prediction accuracy of 90%. Hierarchical clustering validated the selection of 14 out of 15 fHRV measures comprising FII. We developed methodology to identify a distinctive subset of fHRV measures that tracks inflammation over time. The broader potential of this bioinformatics approach is discussed to detect physiological responses encoded in HRV measures.

  13. Voids and the Cosmic Web: cosmic depression & spatial complexity

    NARCIS (Netherlands)

    van de Weygaert, Rien; Shandarin, S.; Saar, E.; Einasto, J.

    2016-01-01

    Voids form a prominent aspect of the Megaparsec distribution of galaxies and matter. Not only do theyrepresent a key constituent of the Cosmic Web, they also are one of the cleanest probesand measures of global cosmological parameters. The shape and evolution of voids are highly sensitive tothe

  14. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  15. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  16. AMMOS2: a web server for protein–ligand–water complexes refinement via molecular mechanics

    Science.gov (United States)

    Labbé, Céline M.; Pencheva, Tania; Jereva, Dessislava; Desvillechabrol, Dimitri; Becot, Jérôme; Villoutreix, Bruno O.; Pajeva, Ilza

    2017-01-01

    Abstract AMMOS2 is an interactive web server for efficient computational refinement of protein–small organic molecule complexes. The AMMOS2 protocol employs atomic-level energy minimization of a large number of experimental or modeled protein–ligand complexes. The web server is based on the previously developed standalone software AMMOS (Automatic Molecular Mechanics Optimization for in silico Screening). AMMOS utilizes the physics-based force field AMMP sp4 and performs optimization of protein–ligand interactions at five levels of flexibility of the protein receptor. The new version 2 of AMMOS implemented in the AMMOS2 web server allows the users to include explicit water molecules and individual metal ions in the protein–ligand complexes during minimization. The web server provides comprehensive analysis of computed energies and interactive visualization of refined protein–ligand complexes. The ligands are ranked by the minimized binding energies allowing the users to perform additional analysis for drug discovery or chemical biology projects. The web server has been extensively tested on 21 diverse protein–ligand complexes. AMMOS2 minimization shows consistent improvement over the initial complex structures in terms of minimized protein–ligand binding energies and water positions optimization. The AMMOS2 web server is freely available without any registration requirement at the URL: http://drugmod.rpbs.univ-paris-diderot.fr/ammosHome.php. PMID:28486703

  17. AMMOS2: a web server for protein-ligand-water complexes refinement via molecular mechanics.

    Science.gov (United States)

    Labbé, Céline M; Pencheva, Tania; Jereva, Dessislava; Desvillechabrol, Dimitri; Becot, Jérôme; Villoutreix, Bruno O; Pajeva, Ilza; Miteva, Maria A

    2017-07-03

    AMMOS2 is an interactive web server for efficient computational refinement of protein-small organic molecule complexes. The AMMOS2 protocol employs atomic-level energy minimization of a large number of experimental or modeled protein-ligand complexes. The web server is based on the previously developed standalone software AMMOS (Automatic Molecular Mechanics Optimization for in silico Screening). AMMOS utilizes the physics-based force field AMMP sp4 and performs optimization of protein-ligand interactions at five levels of flexibility of the protein receptor. The new version 2 of AMMOS implemented in the AMMOS2 web server allows the users to include explicit water molecules and individual metal ions in the protein-ligand complexes during minimization. The web server provides comprehensive analysis of computed energies and interactive visualization of refined protein-ligand complexes. The ligands are ranked by the minimized binding energies allowing the users to perform additional analysis for drug discovery or chemical biology projects. The web server has been extensively tested on 21 diverse protein-ligand complexes. AMMOS2 minimization shows consistent improvement over the initial complex structures in terms of minimized protein-ligand binding energies and water positions optimization. The AMMOS2 web server is freely available without any registration requirement at the URL: http://drugmod.rpbs.univ-paris-diderot.fr/ammosHome.php. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  19. The CAD-score web server: contact area-based comparison of structures and interfaces of proteins, nucleic acids and their complexes.

    Science.gov (United States)

    Olechnovič, Kliment; Venclovas, Ceslovas

    2014-07-01

    The Contact Area Difference score (CAD-score) web server provides a universal framework to compute and analyze discrepancies between different 3D structures of the same biological macromolecule or complex. The server accepts both single-subunit and multi-subunit structures and can handle all the major types of macromolecules (proteins, RNA, DNA and their complexes). It can perform numerical comparison of both structures and interfaces. In addition to entire structures and interfaces, the server can assess user-defined subsets. The CAD-score server performs both global and local numerical evaluations of structural differences between structures or interfaces. The results can be explored interactively using sortable tables of global scores, profiles of local errors, superimposed contact maps and 3D structure visualization. The web server could be used for tasks such as comparison of models with the native (reference) structure, comparison of X-ray structures of the same macromolecule obtained in different states (e.g. with and without a bound ligand), analysis of nuclear magnetic resonance (NMR) structural ensemble or structures obtained in the course of molecular dynamics simulation. The web server is freely accessible at: http://www.ibt.lt/bioinformatics/cad-score. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Science.gov (United States)

    Fristensky, Brian

    2007-01-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere. PMID:17291351

  1. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  2. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    Bioinformatics is an emerging scientific discipline that uses information ... complex biological questions. ... and computer programs for various purposes of primer ..... polymerase chain reaction: Human Immunodeficiency Virus 1 model studies.

  3. The Food Web of Potter Cove (Antarctica): complexity, structure and function

    Science.gov (United States)

    Marina, Tomás I.; Salinas, Vanesa; Cordone, Georgina; Campana, Gabriela; Moreira, Eugenia; Deregibus, Dolores; Torre, Luciana; Sahade, Ricardo; Tatián, Marcos; Barrera Oro, Esteban; De Troch, Marleen; Doyle, Santiago; Quartino, María Liliana; Saravia, Leonardo A.; Momo, Fernando R.

    2018-01-01

    Knowledge of the food web structure and complexity are central to better understand ecosystem functioning. A food-web approach includes both species and energy flows among them, providing a natural framework for characterizing species' ecological roles and the mechanisms through which biodiversity influences ecosystem dynamics. Here we present for the first time a high-resolution food web for a marine ecosystem at Potter Cove (northern Antarctic Peninsula). Eleven food web properties were analyzed in order to document network complexity, structure and topology. We found a low linkage density (3.4), connectance (0.04) and omnivory percentage (45), as well as a short path length (1.8) and a low clustering coefficient (0.08). Furthermore, relating the structure of the food web to its dynamics, an exponential degree distribution (in- and out-links) was found. This suggests that the Potter Cove food web may be vulnerable if the most connected species became locally extinct. For two of the three more connected functional groups, competition overlap graphs imply high trophic interaction between demersal fish and niche specialization according to feeding strategies in amphipods. On the other hand, the prey overlap graph shows also that multiple energy pathways of carbon flux exist across benthic and pelagic habitats in the Potter Cove ecosystem. Although alternative food sources might add robustness to the web, network properties (low linkage density, connectance and omnivory) suggest fragility and potential trophic cascade effects.

  4. The Complexities of Developing Accessible Web Content for Mobile Devices

    Science.gov (United States)

    Hancock, Richard

    This paper is concerned with the development of accessible mobile content and the complexities that arise during this process. The paper gives an overview of the popularity and advantages of mobile devices before tackling the issues surrounding content development, particularly within an educational context. The paper concludes with an overview of an application that was developed for higher education students within a UK college that had a mobile counterpart, allowing the artefact to transcend the typical desktop environment.

  5. Management intensity and vegetation complexity affect web-building spiders and their prey.

    Science.gov (United States)

    Diehl, Eva; Mader, Viktoria L; Wolters, Volkmar; Birkhofer, Klaus

    2013-10-01

    Agricultural management and vegetation complexity affect arthropod diversity and may alter trophic interactions between predators and their prey. Web-building spiders are abundant generalist predators and important natural enemies of pests. We analyzed how management intensity (tillage, cutting of the vegetation, grazing by cattle, and synthetic and organic inputs) and vegetation complexity (plant species richness, vegetation height, coverage, and density) affect rarefied richness and composition of web-building spiders and their prey with respect to prey availability and aphid predation in 12 habitats, ranging from an uncut fallow to a conventionally managed maize field. Spiders and prey from webs were collected manually and the potential prey were quantified using sticky traps. The species richness of web-building spiders and the order richness of prey increased with plant diversity and vegetation coverage. Prey order richness was lower at tilled compared to no-till sites. Hemipterans (primarily aphids) were overrepresented, while dipterans, hymenopterans, and thysanopterans were underrepresented in webs compared to sticky traps. The per spider capture efficiency for aphids was higher at tilled than at no-till sites and decreased with vegetation complexity. After accounting for local densities, 1.8 times more aphids were captured at uncut compared to cut sites. Our results emphasize the functional role of web-building spiders in aphid predation, but suggest negative effects of cutting or harvesting. We conclude that reduced management intensity and increased vegetation complexity help to conserve local invertebrate diversity, and that web-building spiders at sites under low management intensity (e.g., semi-natural habitats) contribute to aphid suppression at the landscape scale.

  6. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  7. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  8. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    Faced with the development of bioinformatics, high-throughput genomic technology have enabled biology to enter the era of big data. [1] Bioinformatics is an interdisciplinary, including the acquisition, management, analysis, interpretation and application of biological information, etc. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets.[2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.

  9. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  10. Parasites affect food web structure primarily through increased diversity and complexity.

    Directory of Open Access Journals (Sweden)

    Jennifer A Dunne

    Full Text Available Comparative research on food web structure has revealed generalities in trophic organization, produced simple models, and allowed assessment of robustness to species loss. These studies have mostly focused on free-living species. Recent research has suggested that inclusion of parasites alters structure. We assess whether such changes in network structure result from unique roles and traits of parasites or from changes to diversity and complexity. We analyzed seven highly resolved food webs that include metazoan parasite data. Our analyses show that adding parasites usually increases link density and connectance (simple measures of complexity, particularly when including concomitant links (links from predators to parasites of their prey. However, we clarify prior claims that parasites "dominate" food web links. Although parasites can be involved in a majority of links, in most cases classic predation links outnumber classic parasitism links. Regarding network structure, observed changes in degree distributions, 14 commonly studied metrics, and link probabilities are consistent with scale-dependent changes in structure associated with changes in diversity and complexity. Parasite and free-living species thus have similar effects on these aspects of structure. However, two changes point to unique roles of parasites. First, adding parasites and concomitant links strongly alters the frequency of most motifs of interactions among three taxa, reflecting parasites' roles as resources for predators of their hosts, driven by trophic intimacy with their hosts. Second, compared to free-living consumers, many parasites' feeding niches appear broader and less contiguous, which may reflect complex life cycles and small body sizes. This study provides new insights about generic versus unique impacts of parasites on food web structure, extends the generality of food web theory, gives a more rigorous framework for assessing the impact of any species on trophic

  11. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-01-01

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  12. ComplexContact: a web server for inter-protein contact prediction using deep learning

    KAUST Repository

    Zeng, Hong

    2018-05-20

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  13. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    Science.gov (United States)

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complex and interact at residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  14. Semantic Web Requirements through Web Mining Techniques

    OpenAIRE

    Hassanzadeh, Hamed; Keyvanpour, Mohammad Reza

    2012-01-01

    In recent years, Semantic web has become a topic of active research in several fields of computer science and has applied in a wide range of domains such as bioinformatics, life sciences, and knowledge management. The two fast-developing research areas semantic web and web mining can complement each other and their different techniques can be used jointly or separately to solve the issues in both areas. In addition, since shifting from current web to semantic web mainly depends on the enhance...

  15. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...... analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work......Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly...

  17. Web mapping system for complex processing and visualization of environmental geospatial datasets

    Science.gov (United States)

    Titov, Alexander; Gordov, Evgeny; Okladnikov, Igor

    2016-04-01

    Environmental geospatial datasets (meteorological observations, modeling and reanalysis results, etc.) are used in numerous research applications. Due to a number of objective reasons such as inherent heterogeneity of environmental datasets, big dataset volume, complexity of data models used, syntactic and semantic differences that complicate creation and use of unified terminology, the development of environmental geodata access, processing and visualization services as well as client applications turns out to be quite a sophisticated task. According to general INSPIRE requirements to data visualization geoportal web applications have to provide such standard functionality as data overview, image navigation, scrolling, scaling and graphical overlay, displaying map legends and corresponding metadata information. It should be noted that modern web mapping systems as integrated geoportal applications are developed based on the SOA and might be considered as complexes of interconnected software tools for working with geospatial data. In the report a complex web mapping system including GIS web client and corresponding OGC services for working with geospatial (NetCDF, PostGIS) dataset archive is presented. There are three basic tiers of the GIS web client in it: 1. Tier of geospatial metadata retrieved from central MySQL repository and represented in JSON format 2. Tier of JavaScript objects implementing methods handling: --- NetCDF metadata --- Task XML object for configuring user calculations, input and output formats --- OGC WMS/WFS cartographical services 3. Graphical user interface (GUI) tier representing JavaScript objects realizing web application business logic Metadata tier consists of a number of JSON objects containing technical information describing geospatial datasets (such as spatio-temporal resolution, meteorological parameters, valid processing methods, etc). The middleware tier of JavaScript objects implementing methods for handling geospatial

  18. iview: an interactive WebGL visualizer for protein-ligand complex.

    Science.gov (United States)

    Li, Hongjian; Leung, Kwong-Sak; Nakane, Takanori; Wong, Man-Hon

    2014-02-25

    Visualization of protein-ligand complex plays an important role in elaborating protein-ligand interactions and aiding novel drug design. Most existing web visualizers either rely on slow software rendering, or lack virtual reality support. The vital feature of macromolecular surface construction is also unavailable. We have developed iview, an easy-to-use interactive WebGL visualizer of protein-ligand complex. It exploits hardware acceleration rather than software rendering. It features three special effects in virtual reality settings, namely anaglyph, parallax barrier and oculus rift, resulting in visually appealing identification of intermolecular interactions. It supports four surface representations including Van der Waals surface, solvent excluded surface, solvent accessible surface and molecular surface. Moreover, based on the feature-rich version of iview, we have also developed a neat and tailor-made version specifically for our istar web platform for protein-ligand docking purpose. This demonstrates the excellent portability of iview. Using innovative 3D techniques, we provide a user friendly visualizer that is not intended to compete with professional visualizers, but to enable easy accessibility and platform independence.

  19. Examining the Potential of Web-Based Multimedia to Support Complex Fine Motor Skill Learning: An Empirical Study

    Science.gov (United States)

    Papastergiou, Marina; Pollatou, Elisana; Theofylaktou, Ioannis; Karadimou, Konstantina

    2014-01-01

    Research on the utilization of the Web for complex fine motor skill learning that involves whole body movements is still scarce. The aim of this study was to evaluate the impact of the introduction of a multimedia web-based learning environment, which was targeted at a rhythmic gymnastics routine consisting of eight fine motor skills, into an…

  20. Food-Web Complexity in Guaymas Basin Hydrothermal Vents and Cold Seeps.

    Directory of Open Access Journals (Sweden)

    Marie Portail

    Full Text Available In the Guaymas Basin, the presence of cold seeps and hydrothermal vents in close proximity, similar sedimentary settings and comparable depths offers a unique opportunity to assess and compare the functioning of these deep-sea chemosynthetic ecosystems. The food webs of five seep and four vent assemblages were studied using stable carbon and nitrogen isotope analyses. Although the two ecosystems shared similar potential basal sources, their food webs differed: seeps relied predominantly on methanotrophy and thiotrophy via the Calvin-Benson-Bassham (CBB cycle and vents on petroleum-derived organic matter and thiotrophy via the CBB and reductive tricarboxylic acid (rTCA cycles. In contrast to symbiotic species, the heterotrophic fauna exhibited high trophic flexibility among assemblages, suggesting weak trophic links to the metabolic diversity of chemosynthetic primary producers. At both ecosystems, food webs did not appear to be organised through predator-prey links but rather through weak trophic relationships among co-occurring species. Examples of trophic or spatial niche differentiation highlighted the importance of species-sorting processes within chemosynthetic ecosystems. Variability in food web structure, addressed through Bayesian metrics, revealed consistent trends across ecosystems. Food-web complexity significantly decreased with increasing methane concentrations, a common proxy for the intensity of seep and vent fluid fluxes. Although high fluid-fluxes have the potential to enhance primary productivity, they generate environmental constraints that may limit microbial diversity, colonisation of consumers and the structuring role of competitive interactions, leading to an overall reduction of food-web complexity and an increase in trophic redundancy. Heterogeneity provided by foundation species was identified as an additional structuring factor. According to their biological activities, foundation species may have the potential to

  1. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  2. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  3. Food-web complexity across hydrothermal vents on the Azores triple junction

    Science.gov (United States)

    Portail, Marie; Brandily, Christophe; Cathalot, Cécile; Colaço, Ana; Gélinas, Yves; Husson, Bérengère; Sarradin, Pierre-Marie; Sarrazin, Jozée

    2018-01-01

    The assessment and comparison of food webs across various hydrothermal vent sites can enhance our understanding of ecological processes involved in the structure and function of biodiversity. The Menez Gwen, Lucky Strike and Rainbow vent fields are located on the Azores triple junction of the Mid-Atlantic Ridge. These fields have distinct depths (from 850 to 2320 m) and geological contexts (basaltic and ultramafic), but share similar faunal assemblages defined by the presence of foundation species that include Bathymodiolus azoricus, alvinocarid shrimp and gastropods. We compared the food webs of 13 faunal assemblages at these three sites using carbon and nitrogen stable isotope analyses (SIA). Results showed that photosynthesis-derived organic matter is a negligible basal source for vent food webs, at all depths. The contribution of methanotrophy versus autotrophy based on Calvin-Benson-Bassham (CBB) or reductive tricarboxylic acid (rTCA) cycles varied between and within vent fields according to the concentrations of reduced compounds (e.g. CH4, H2S). Species that were common to vent fields showed high trophic flexibility, suggesting weak trophic links to the metabolism of chemosynthetic primary producers. At the community level, a comparison of SIA-derived metrics between mussel assemblages from two vent fields (Menez Gwen & Lucky Strike) showed that the functional structure of food webs was highly similar in terms of basal niche diversification, functional specialization and redundancy. Coupling SIA to functional trait approaches included more variability within the analyses, but the functional structures were still highly comparable. These results suggest that despite variable environmental conditions (physico-chemical factors and basal sources) and faunal community structure, functional complexity remained relatively constant among mussel assemblages. This functional similarity may be favoured by the propensity of species to adapt to fluid variations and

  4. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  5. COEUS: "semantic web in a box" for biomedical applications.

    Science.gov (United States)

    Lopes, Pedro; Oliveira, José Luís

    2012-12-17

    As the "omics" revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter's complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a "semantic web in a box" approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.

  6. EpiCollect+: linking smartphones to web applications for complex data collection projects.

    Science.gov (United States)

    Aanensen, David M; Huntley, Derek M; Menegazzo, Mirko; Powell, Chris I; Spratt, Brian G

    2014-01-01

    Previously, we have described the development of the generic mobile phone data gathering tool, EpiCollect, and an associated web application, providing two-way communication between multiple data gatherers and a project database. This software only allows data collection on the phone using a single questionnaire form that is tailored to the needs of the user (including a single GPS point and photo per entry), whereas many applications require a more complex structure, allowing users to link a series of forms in a linear or branching hierarchy, along with the addition of any number of media types accessible from smartphones and/or tablet devices (e.g., GPS, photos, videos, sound clips and barcode scanning). A much enhanced version of EpiCollect has been developed (EpiCollect+). The individual data collection forms in EpiCollect+ provide more design complexity than the single form used in EpiCollect, and the software allows the generation of complex data collection projects through the ability to link many forms together in a linear (or branching) hierarchy. Furthermore, EpiCollect+ allows the collection of multiple media types as well as standard text fields, increased data validation and form logic. The entire process of setting up a complex mobile phone data collection project to the specification of a user (project and form definitions) can be undertaken at the EpiCollect+ website using a simple 'drag and drop' procedure, with visualisation of the data gathered using Google Maps and charts at the project website. EpiCollect+ is suitable for situations where multiple users transmit complex data by mobile phone (or other Android devices) to a single project web database and is already being used for a range of field projects, particularly public health projects in sub-Saharan Africa. However, many uses can be envisaged from education, ecology and epidemiology to citizen science.

  7. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  8. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  9. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  10. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  11. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  12. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation. Provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems Uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems Contains 45 bioinformatics problems that have been investigated in recent research.

  13. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  14. Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools

    Science.gov (United States)

    Diaz Acosta, B.

    2011-01-01

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative is comprised of two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework—initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.

  15. Analysis of co-occurrence toponyms in web pages based on complex networks

    Science.gov (United States)

    Zhong, Xiang; Liu, Jiajun; Gao, Yong; Wu, Lun

    2017-01-01

    A large number of geographical toponyms exist in web pages and other documents, providing abundant geographical resources for GIS. It is very common for toponyms to co-occur in the same documents. To investigate these relations associated with geographic entities, a novel complex network model for co-occurrence toponyms is proposed. Then, 12 toponym co-occurrence networks are constructed from the toponym sets extracted from the People's Daily Paper documents of 2010. It is found that two toponyms have a high co-occurrence probability if they are at the same administrative level or if they possess a part-whole relationship. By applying complex network analysis methods to toponym co-occurrence networks, we find the following characteristics. (1) The navigation vertices of the co-occurrence networks can be found by degree centrality analysis. (2) The networks express strong cluster characteristics, and it takes only several steps to reach one vertex from another one, implying that the networks are small-world graphs. (3) The degree distribution satisfies the power law with an exponent of 1.7, so the networks are free-scale. (4) The networks are disassortative and have similar assortative modes, with assortative exponents of approximately 0.18 and assortative indexes less than 0. (5) The frequency of toponym co-occurrence is weakly negatively correlated with geographic distance, but more strongly negatively correlated with administrative hierarchical distance. Considering the toponym frequencies and co-occurrence relationships, a novel method based on link analysis is presented to extract the core toponyms from web pages. This method is suitable and effective for geographical information retrieval.

  16. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  17. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and are ranged in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  18. Bioinformatics-Aided Venomics

    Directory of Open Access Journals (Sweden)

    Quentin Kaas

    2015-06-01

    Full Text Available Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future.

  19. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  20. mORCA: sailing bioinformatics world with mobile devices.

    Science.gov (United States)

    Díaz-Del-Pino, Sergio; Falgueras, Juan; Perez-Wohlfeil, Esteban; Trelles, Oswaldo

    2018-03-01

    Nearly 10 years have passed since the first mobile apps appeared. Given the fact that bioinformatics is a web-based world and that mobile devices are endowed with web-browsers, it seemed natural that bioinformatics would transit from personal computers to mobile devices but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Throughout an in-depth analysis of requirements of existing bioinformatics applications we designed and deployed an easy-to-use web-based lightweight mobile client. Such client is able to browse, select, compose automatically interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App store by Apple and Play Store by Google. The software will be available for at least 2 years. ortrelles@uma.es. Source code, final web-app, training material and documentation is available at http://bitlab-es.com/morca. © The Author(s) 2017. Published by Oxford University Press.

  1. Reciprocal diversification in a complex plant-herbivore-parasitoid food web

    Directory of Open Access Journals (Sweden)

    Bokma Folmer

    2007-11-01

    Full Text Available Abstract Background Plants, plant-feeding insects, and insect parasitoids form some of the most complex and species-rich food webs. According to the classic escape-and-radiate (EAR hypothesis, these hyperdiverse communities result from coevolutionary arms races consisting of successive cycles of enemy escape, radiation, and colonization by new enemy lineages. It has also been suggested that "enemy-free space" provided by novel host plants could promote host shifts by herbivores, and that parasitoids could similarly drive diversification of gall form in insects that induce galls on plants. Because these central coevolutionary hypotheses have never been tested in a phylogenetic framework, we combined phylogenetic information on willow-galling sawflies with data on their host plants, gall types, and enemy communities. Results We found that evolutionary shifts in host plant use and habitat have led to dramatic prunings of parasitoid communities, and that changes in gall phenotype can provide "enemy-free morphospace" for millions of years even in the absence of host plant shifts. Some parasites have nevertheless managed to colonize recently-evolved gall types, and this has apparently led to adaptive speciation in several enemy groups. However, having fewer enemies does not in itself increase speciation probabilities in individual sawfly lineages, partly because the high diversity of the enemy community facilitates compensatory attack by remaining parasite taxa. Conclusion Taken together, our results indicate that niche-dependent parasitism is a major force promoting ecological divergence in herbivorous insects, and that prey divergence can cause speciation in parasite lineages. However, the results also show that the EAR hypothesis is too simplistic for species-rich food webs: instead, diversification seems to be spurred by a continuous stepwise process, in which ecological and phenotypic shifts in prey lineages are followed by a lagged evolutionary

  2. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  3. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  4. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio eHernández

    2015-06-01

    Full Text Available Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful that Bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: a remote homology searches using Psi-Blast, b detection of functional motifs and domains, c analysis of data from protein-protein interaction databases (PPIs, d match the query protein sequence to 3D databases (i.e., algorithms as PISITE, e mutation correlation analysis between amino acids by algorithms as MISTIC. Programs designed to identify functional motif/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs have the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations –it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/, previously published by our group, has been used as a benchmark for the all of the analyses.

  5. ALF: a strategy for identification of unauthorized GMOs in complex mixtures by a GW-NGS method and dedicated bioinformatics analysis.

    Science.gov (United States)

    Košir, Alexandra Bogožalec; Arulandhu, Alfred J; Voorhuijzen, Marleen M; Xiao, Hongmei; Hagelaar, Rico; Staats, Martijn; Costessi, Adalberto; Žel, Jana; Kok, Esther J; Dijk, Jeroen P van

    2017-10-26

    The majority of feed products in industrialised countries contains materials derived from genetically modified organisms (GMOs). In parallel, the number of reports of unauthorised GMOs (UGMOs) is gradually increasing. There is a lack of specific detection methods for UGMOs, due to the absence of detailed sequence information and reference materials. In this research, an adapted genome walking approach was developed, called ALF: Amplification of Linearly-enriched Fragments. Coupling of ALF to NGS aims for simultaneous detection and identification of all GMOs, including UGMOs, in one sample, in a single analysis. The ALF approach was assessed on a mixture made of DNA extracts from four reference materials, in an uneven distribution, mimicking a real life situation. The complete insert and genomic flanking regions were known for three of the included GMO events, while for MON15985 only partial sequence information was available. Combined with a known organisation of elements, this GMO served as a model for a UGMO. We successfully identified sequences matching with this organisation of elements serving as proof of principle for ALF as new UGMO detection strategy. Additionally, this study provides a first outline of an automated, web-based analysis pipeline for identification of UGMOs containing known GM elements.

  6. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  7. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  8. Bioinformatics of cardiovascular miRNA biology.

    Science.gov (United States)

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. The HADDOCK2.2 Web Server: User-Friendly Integrative Modeling of Biomolecular Complexes.

    Science.gov (United States)

    van Zundert, G C P; Rodrigues, J P G L M; Trellet, M; Schmitz, C; Kastritis, P L; Karaca, E; Melquiond, A S J; van Dijk, M; de Vries, S J; Bonvin, A M J J

    2016-02-22

    The prediction of the quaternary structure of biomolecular macromolecules is of paramount importance for fundamental understanding of cellular processes and drug design. In the era of integrative structural biology, one way of increasing the accuracy of modeling methods used to predict the structure of biomolecular complexes is to include as much experimental or predictive information as possible in the process. This has been at the core of our information-driven docking approach HADDOCK. We present here the updated version 2.2 of the HADDOCK portal, which offers new features such as support for mixed molecule types, additional experimental restraints and improved protocols, all of this in a user-friendly interface. With well over 6000 registered users and 108,000 jobs served, an increasing fraction of which on grid resources, we hope that this timely upgrade will help the community to solve important biological questions and further advance the field. The HADDOCK2.2 Web server is freely accessible to non-profit users at http://haddock.science.uu.nl/services/HADDOCK2.2. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.

  10. Web Resources for Metagenomics Studies

    Directory of Open Access Journals (Sweden)

    Pravin Dudhagara

    2015-10-01

    Full Text Available The development of next-generation sequencing (NGS platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly-used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data. There are at least a dozen such software tools presently available in the public domain. Among them, MGRAST, IMG/M, and METAVIR are the most well-known tools according to the number of citations by peer-reviewed scientific media up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We have also done the rating for each tool to screen more potential and preferential tools and evaluated five best tools using synthetic metagenome. The article comprehensively deals with the contemporary problems and the prospects of metagenomics from a bioinformatics viewpoint.

  11. Aesthetic design of e-commerce web pages—Complexity, order, and preferences

    NARCIS (Netherlands)

    Poole, M.S.

    2012-01-01

    This study was conducted to understand the perceptual structure of e-commerce webpage visual aesthetics and to provide insight into how physical design features of web pages influence users' aesthetic perception of and preference for web pages. Drawing on the environmental aesthetics, human-computer

  12. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  13. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Full Text Available Objective: Within the information technology (IT industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise′s overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework

  14. The growing need for microservices in bioinformatics.

    Science.gov (United States)

    Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J

    2016-01-01

    Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on nimble IT framework which can adapt to changing requirements. To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and

  15. The growing need for microservices in bioinformatics

    Science.gov (United States)

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework is an effective

  16. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  17. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  18. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  19. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  20. Implementation of a web-based national child health-care programme in a local context: A complex facilitator role.

    Science.gov (United States)

    Tell, Johanna; Olander, Ewy; Anderberg, Peter; Berglund, Johan Sanmartin

    2018-02-01

    The aim of this study was to investigate child health-care coordinators' experiences of being a facilitator for the implementation of a new national child health-care programme in the form of a web-based national guide. The study was based on eight remote, online focus groups, using Skype for Business. A qualitative content analysis was performed. The analysis generated three categories: adapt to a local context, transition challenges and led by strong incentives. There were eight subcategories. In the latent analysis, the theme 'Being a facilitator: a complex role' was formed to express the child health-care coordinators' experiences. Facilitating a national guideline or decision support in a local context is a complex task that requires an advocating and mediating role. For successful implementation, guidelines and decision support, such as a web-based guide and the new child health-care programme, must match professional consensus and needs and be seen as relevant by all. Participation in the development and a strong bottom-up approach was important, making the web-based guide and the programme relevant to whom it is intended to serve, and for successful implementation. The study contributes valuable knowledge when planning to implement a national web-based decision support and policy programme in a local health-care context.

  1. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  2. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  3. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.

  4. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  5. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  6. Deciphering psoriasis. A bioinformatic approach.

    Science.gov (United States)

    Melero, Juan L; Andrades, Sergi; Arola, Lluís; Romeu, Antoni

    2018-02-01

    Psoriasis is an immune-mediated, inflammatory and hyperproliferative disease of the skin and joints. The cause of psoriasis is still unknown. The fundamental feature of the disease is the hyperproliferation of keratinocytes and the recruitment of cells from the immune system in the region of the affected skin, which leads to deregulation of many well-known gene expressions. Based on data mining and bioinformatic scripting, here we show a new dimension of the effect of psoriasis at the genomic level. Using our own pipeline of scripts in Perl and MySql and based on the freely available NCBI Gene Expression Omnibus (GEO) database: DataSet Record GDS4602 (Series GSE13355), we explore the extent of the effect of psoriasis on gene expression in the affected tissue. We give greater insight into the effects of psoriasis on the up-regulation of some genes in the cell cycle (CCNB1, CCNA2, CCNE2, CDK1) or the dynamin system (GBPs, MXs, MFN1), as well as the down-regulation of typical antioxidant genes (catalase, CAT; superoxide dismutases, SOD1-3; and glutathione reductase, GSR). We also provide a complete list of the human genes and how they respond in a state of psoriasis. Our results show that psoriasis affects all chromosomes and many biological functions. If we further consider the stable and mitotically inheritable character of the psoriasis phenotype, and the influence of environmental factors, then it seems that psoriasis has an epigenetic origin. This fit well with the strong hereditary character of the disease as well as its complex genetic background. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.

  7. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  8. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly-used accuracy measures (e.g. sensitivity, PPV, MCC and F-score) as well as it can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. Not only the concept presented in this paper gives a useful framework to design MEA-based estimators but also it is highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017

  9. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT, to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs, thus greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work

  10. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Full Text Available Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.We designed and implemented the Genomics Virtual Laboratory (GVL as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  11. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Science.gov (United States)

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the

  12. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  13. Onebox: Free-Text Interfaces as an Alternative to Complex Web Forms

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien; Trieschnigg, Rudolf Berend; Hiemstra, Djoerd

    2011-01-01

    This paper investigates the problem of translating free-text queries into key-value pairs as an alternative means for searching `behind' web forms. We introduce a novel specication language for specifying free-text interfaces, and report the results of a user study where we evaluated our prototype

  14. Parasites Affect Food Web Structure Primarily through Increased Diversity and Complexity

    NARCIS (Netherlands)

    Dunne, J.A.; Lafferty, K.D.; Dobson, A.P.; Hechinger, R.F.; Kuris, A.M.; Martinez, N.D.; McLaughlin, J.P.; Mouritsen, K.N.; Poulin, R.; Reise, K.; Stouffer, D.B.; Thieltges, D.W.; Williams, R.J.; Zander, C.D.

    2013-01-01

    Comparative research on food web structure has revealed generalities in trophic organization, produced simple models, and allowed assessment of robustness to species loss. These studies have mostly focused on free-living species. Recent research has suggested that inclusion of parasites alters

  15. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  16. The EMBRACE web service collection.

    NARCIS (Netherlands)

    Pettifer, S.; Ison, J.; Kalas, M.; Thorne, D.; McDermott, P.; Jonassen, I.; Liaquat, A.; Fernandez, J.M.; Rodriguez, J.M.; Pisano, D.G.; Blanchet, C; Uludag, M.; Rice, P.; Bartaseviciute, E.; Rapacki, K.; Hekkelman, M.L.; Sand, O.; Stockinger, H.; Clegg, A.B.; Bongcam-Rudloff, E.; Salzemann, J.; Breton, V.; Attwood, T.K.; Cameron, G.; Vriend, G.

    2010-01-01

    The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for

  17. Peer Mentoring for Bioinformatics presentation

    OpenAIRE

    Budd, Aidan

    2014-01-01

    A handout used in a HUB (Heidelberg Unseminars in Bioinformatics) meeting focused on career development for bioinformaticians. It describes an activity for use to help introduce the idea of peer mentoring, potnetially acting as an opportunity to create peer-mentoring groups.

  18. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  19. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  20. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  1. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  2. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  3. Tools and data services registry: a community effort to document bioinformatics resources

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  4. Hidden in the Middle: Culture, Value and Reward in Bioinformatics

    Science.gov (United States)

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics--the so-called shotgun marriage between biology and computer science--is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised…

  5. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  6. Winners and losers in the complex web of global supply chains.

    Science.gov (United States)

    Glendon, Lee

    2013-01-01

    This paper discusses how supply chain, risk and business continuity professionals can collaboratively address the consequences of increasing supply chain complexity in order to deliver both resilient and sustainable supply chains. The paper identifies the key drivers of complexity supported by recent case examples, including the equine DNA scandal and the Rana Plaza tragedy in Bangladesh.

  7. Evaluating Complex Interventions and Health Technologies Using Normalization Process Theory: Development of a Simplified Approach and Web-Enabled Toolkit

    LENUS (Irish Health Repository)

    May, Carl R

    2011-09-30

    Abstract Background Normalization Process Theory (NPT) can be used to explain implementation processes in health care relating to new technologies and complex interventions. This paper describes the processes by which we developed a simplified version of NPT for use by clinicians, managers, and policy makers, and which could be embedded in a web-enabled toolkit and on-line users manual. Methods Between 2006 and 2010 we undertook four tasks. (i) We presented NPT to potential and actual users in multiple workshops, seminars, and presentations. (ii) Using what we discovered from these meetings, we decided to create a simplified set of statements and explanations expressing core constructs of the theory (iii) We circulated these statements to a criterion sample of 60 researchers, clinicians and others, using SurveyMonkey to collect qualitative textual data about their criticisms of the statements. (iv) We then reconstructed the statements and explanations to meet users\\' criticisms, embedded them in a web-enabled toolkit, and beta tested this \\'in the wild\\'. Results On-line data collection was effective: over a four week period 50\\/60 participants responded using SurveyMonkey (40\\/60) or direct phone and email contact (10\\/60). An additional nine responses were received from people who had been sent the SurveyMonkey form by other respondents. Beta testing of the web enabled toolkit produced 13 responses, from 327 visits to http:\\/\\/www.normalizationprocess.org. Qualitative analysis of both sets of responses showed a high level of support for the statements but also showed that some statements poorly expressed their underlying constructs or overlapped with others. These were rewritten to take account of users\\' criticisms and then embedded in a web-enabled toolkit. As a result we were able translate the core constructs into a simplified set of statements that could be utilized by non-experts. Conclusion Normalization Process Theory has been developed through

  8. Evaluating complex interventions and health technologies using normalization process theory: development of a simplified approach and web-enabled toolkit

    Directory of Open Access Journals (Sweden)

    Murray Elizabeth

    2011-09-01

    Full Text Available Abstract Background Normalization Process Theory (NPT can be used to explain implementation processes in health care relating to new technologies and complex interventions. This paper describes the processes by which we developed a simplified version of NPT for use by clinicians, managers, and policy makers, and which could be embedded in a web-enabled toolkit and on-line users manual. Methods Between 2006 and 2010 we undertook four tasks. (i We presented NPT to potential and actual users in multiple workshops, seminars, and presentations. (ii Using what we discovered from these meetings, we decided to create a simplified set of statements and explanations expressing core constructs of the theory (iii We circulated these statements to a criterion sample of 60 researchers, clinicians and others, using SurveyMonkey to collect qualitative textual data about their criticisms of the statements. (iv We then reconstructed the statements and explanations to meet users' criticisms, embedded them in a web-enabled toolkit, and beta tested this 'in the wild'. Results On-line data collection was effective: over a four week period 50/60 participants responded using SurveyMonkey (40/60 or direct phone and email contact (10/60. An additional nine responses were received from people who had been sent the SurveyMonkey form by other respondents. Beta testing of the web enabled toolkit produced 13 responses, from 327 visits to http://www.normalizationprocess.org. Qualitative analysis of both sets of responses showed a high level of support for the statements but also showed that some statements poorly expressed their underlying constructs or overlapped with others. These were rewritten to take account of users' criticisms and then embedded in a web-enabled toolkit. As a result we were able translate the core constructs into a simplified set of statements that could be utilized by non-experts. Conclusion Normalization Process Theory has been developed through

  9. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  10. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  11. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  12. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    Full Text Available Abstract Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS, can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical

  13. Opal web services for biomedical applications.

    Science.gov (United States)

    Ren, Jingyuan; Williams, Nadya; Clementi, Luca; Krishnan, Sriram; Li, Wilfred W

    2010-07-01

    Biomedical applications have become increasingly complex, and they often require large-scale high-performance computing resources with a large number of processors and memory. The complexity of application deployment and the advances in cluster, grid and cloud computing require new modes of support for biomedical research. Scientific Software as a Service (sSaaS) enables scalable and transparent access to biomedical applications through simple standards-based Web interfaces. Towards this end, we built a production web server (http://ws.nbcr.net) in August 2007 to support the bioinformatics application called MEME. The server has grown since to include docking analysis with AutoDock and AutoDock Vina, electrostatic calculations using PDB2PQR and APBS, and off-target analysis using SMAP. All the applications on the servers are powered by Opal, a toolkit that allows users to wrap scientific applications easily as web services without any modification to the scientific codes, by writing simple XML configuration files. Opal allows both web forms-based access and programmatic access of all our applications. The Opal toolkit currently supports SOAP-based Web service access to a number of popular applications from the National Biomedical Computation Resource (NBCR) and affiliated collaborative and service projects. In addition, Opal's programmatic access capability allows our applications to be accessed through many workflow tools, including Vision, Kepler, Nimrod/K and VisTrails. From mid-August 2007 to the end of 2009, we have successfully executed 239,814 jobs. The number of successfully executed jobs more than doubled from 205 to 411 per day between 2008 and 2009. The Opal-enabled service model is useful for a wide range of applications. It provides for interoperation with other applications with Web Service interfaces, and allows application developers to focus on the scientific tool and workflow development. Web server availability: http://ws.nbcr.net.

  14. The structural bioinformatics library: modeling in biomolecular science and beyond.

    Science.gov (United States)

    Cazals, Frédéric; Dreyfus, Tom

    2017-04-01

    Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library ( SBL , http://sbl.inria.fr ), a generic C ++/python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performances. The SBL involves four software components (1-4 thereafter). For end-users, the SBL provides ready to use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C ++ toolbox with modular design, involving core (2) algorithms , (3) biophysical models and (4) modules , the latter being especially suited to develop novel applications. The SBL comes with a thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  15. Improving Ecological Forecasting: Data Assimilation Enhances the Ecological Forecast Horizon of a Complex Food Web

    Science.gov (United States)

    Massoud, E. C.; Huisman, J.; Benincà, E.; Bouten, W.; Vrugt, J. A.

    2017-12-01

    Species abundances in ecological communities can display chaotic non-equilibrium dynamics. A characteristic feature of chaotic systems is that long-term prediction of the system's trajectory is fundamentally impossible. How then should we make predictions for complex multi-species communities? We explore data assimilation (DA) with the Ensemble Kalman Filter (EnKF) to fuse a two-predator-two-prey model with abundance data from a long term experiment of a plankton community which displays chaotic dynamics. The results show that DA improves substantially the predictability and ecological forecast horizon of complex community dynamics. In addition, we show that DA helps provide guidance on measurement design, for instance on defining the frequency of observations. The study presented here is highly innovative, because DA methods at the current stage are almost unknown in ecology.

  16. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and  Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, at June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  17. Fast-Dissolving, Prolonged Release, and Antibacterial Cyclodextrin/Limonene-Inclusion Complex Nanofibrous Webs via Polymer-Free Electrospinning.

    Science.gov (United States)

    Aytac, Zeynep; Yildiz, Zehra Irem; Kayaci-Senirmak, Fatma; San Keskin, Nalan Oya; Kusku, Semran Ipek; Durgun, Engin; Tekinay, Turgay; Uyar, Tamer

    2016-10-05

    We have proposed a new strategy for preparing free-standing nanofibrous webs from an inclusion complex (IC) of a well-known flavor/fragrance compound (limonene) with three modified cyclodextrins (HPβCD, MβCD, and HPγCD) via electrospinning (CD/limonene-IC-NFs) without using a polymeric matrix. The experimental and computational modeling studies proved that the stoichiometry of the complexes was 1:1 for CD/limonene systems. MβCD/limonene-IC-NF released much more limonene at 37, 50, and 75 °C than HPβCD/limonene-IC-NF and HPγCD/limonene-IC-NF because of the greater amount of preserved limonene. Moreover, MβCD/limonene-IC-NF has released only 25% (w/w) of its limonene, whereas HPβCD/limonene-IC-NF and HPγCD/limonene-IC-NF released 51 and 88% (w/w) of their limonene in 100 days, respectively. CD/limonene-IC-NFs exhibited high antibacterial activity against E. coli and S. aureus. The water solubility of limonene increased significantly and CD/limonene-IC-NFs were dissolved in water in a few seconds. In brief, CD/limonene-IC-NFs with fast-dissolving character enhanced the thermal stability and prolonged the shelf life along with antibacterial properties could be quite applicable in food and oral care applications.

  18. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  19. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  20. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  2. Web-based communications systems: innovate solutions to complex development challenges

    International Nuclear Information System (INIS)

    Levy, D.S.; Sordi, G.M.; Villavicencio, A.L.H.; Biazini Filho, F.

    2015-01-01

    This research work focus on the potential value of Information and Communication Technologies (ICTs) to enhance communication and education on Radiological Protection throughout Brazil. ICTs present unprecedented opportunities to innovate solutions to complex development issues, in this large country where it is a strong challenge to ensure access to information to as many people as possible, minimizing costs and optimizing results. Therefore, taking advantage of the impact of ICTs in modern Information Society and its institutions, some research works include education for workers, researchers and the public, offering conditions for learning and improving professional and personal skills. UNIPRORAD is a research work of informatization of radiological protection programs to offer unified programs and inter-related information in Portuguese. The system provides Brazilian facilities and researchers a complete repository for research, consultation and information. The content includes the best practices for optimization and monitoring programs, taking into account that in order to establish a Radiation Protection Plan or a Radiation Emergency Plan, there must be observed all procedures based on national and international recommendations published by different organizations over the past decades: International Commission on Radiological Protection (ICRP), International Atomic Energy Agency (IAEA) and National Nuclear Energy Commission (CNEN). Other than the efforts to disseminate information to radioactive facilities and researches, it is equally essential to invest in education and communication to increase public knowledge and understanding of the benefits of Nuclear Technology, such as food irradiation and social responsibility for electric power generation, for public acceptance of Nuclear Technology depends on public understanding of radiation and its effects on individuals, workers and environment. This research work aims to present several important initiatives

  3. Web-based bioinformatic resources for protein and nucleic acids ...

    African Journals Online (AJOL)

    Admin

    DNA sequencing is the deciphering of hereditary information. It is an indispensable prerequisite for many biotechnical applications and technologies and the continual acquisition of genomic information is very important. This opens the door not only for further research and better understanding of the architectural plan of life ...

  4. BioRuby: bioinformatics software for the Ruby programming language.

    Science.gov (United States)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-10-15

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. katayama@bioruby.org

  5. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  6. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  7. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  8. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  9. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    Full Text Available The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions and in bioinformatics (comparison of genomes.

  10. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  11. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  12. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  13. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  14. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  15. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  16. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.

  17. SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

    Science.gov (United States)

    Couvin, David; Zozio, Thierry; Rastogi, Nalin

    2017-07-01

    Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98049 clustered strains plus 5807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides with the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets.

    Science.gov (United States)

    Rideout, Jai Ram; Chase, John H; Bolyen, Evan; Ackermann, Gail; González, Antonio; Knight, Rob; Caporaso, J Gregory

    2016-06-13

    Bioinformatics software often requires human-generated tabular text files as input and has specific requirements for how those data are formatted. Users frequently manage these data in spreadsheet programs, which is convenient for researchers who are compiling the requisite information because the spreadsheet programs can easily be used on different platforms including laptops and tablets, and because they provide a familiar interface. It is increasingly common for many different researchers to be involved in compiling these data, including study coordinators, clinicians, lab technicians and bioinformaticians. As a result, many research groups are shifting toward using cloud-based spreadsheet programs, such as Google Sheets, which support the concurrent editing of a single spreadsheet by different users working on different platforms. Most of the researchers who enter data are not familiar with the formatting requirements of the bioinformatics programs that will be used, so validating and correcting file formats is often a bottleneck prior to beginning bioinformatics analysis. We present Keemei, a Google Sheets Add-on, for validating tabular files used in bioinformatics analyses. Keemei is available free of charge from Google's Chrome Web Store. Keemei can be installed and run on any web browser supported by Google Sheets. Keemei currently supports the validation of two widely used tabular bioinformatics formats, the Quantitative Insights into Microbial Ecology (QIIME) sample metadata mapping file format and the Spatially Referenced Genetic Data (SRGD) format, but is designed to easily support the addition of others. Keemei will save researchers time and frustration by providing a convenient interface for tabular bioinformatics file format validation. By allowing everyone involved with data entry for a project to easily validate their data, it will reduce the validation and formatting bottlenecks that are commonly encountered when human-generated data files are

  19. The Semantic Automated Discovery and Integration (SADI Web service Design-Pattern, API and Reference Implementation

    Directory of Open Access Journals (Sweden)

    Wilkinson Mark D

    2011-10-01

    Full Text Available Abstract Background The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. Description SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. Conclusions SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services

  20. The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation

    Science.gov (United States)

    2011-01-01

    Background The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. Description SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. Conclusions SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner

  1. The Semantic Automated Discovery and Integration (SADI) Web service Design-Pattern, API and Reference Implementation.

    Science.gov (United States)

    Wilkinson, Mark D; Vandervalk, Benjamin; McCarthy, Luke

    2011-10-24

    The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services can be explored in a manner very similar to data housed in

  2. Enhanced reproducibility of SADI web service workflows with Galaxy and Docker.

    Science.gov (United States)

    Aranguren, Mikel Egaña; Wilkinson, Mark D

    2015-01-01

    Semantic Web technologies have been widely applied in the life sciences, for example by data providers such as OpenLifeData and through web services frameworks such as SADI. The recently reported OpenLifeData2SADI project offers access to the vast OpenLifeData data store through SADI services. This article describes how to merge data retrieved from OpenLifeData2SADI with other SADI services using the Galaxy bioinformatics analysis platform, thus making this semantic data more amenable to complex analyses. This is demonstrated using a working example, which is made distributable and reproducible through a Docker image that includes SADI tools, along with the data and workflows that constitute the demonstration. The combination of Galaxy and Docker offers a solution for faithfully reproducing and sharing complex data retrieval and analysis workflows based on the SADI Semantic web service design patterns.

  3. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer

    2016-01-01

    The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially......, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry....

  4. Web Caching

    Indian Academy of Sciences (India)

    leveraged through Web caching technology. Specifically, Web caching becomes an ... Web routing can improve the overall performance of the Internet. Web caching is similar to memory system caching - a Web cache stores Web resources in ...

  5. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  6. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  7. p3d--Python module for structural bioinformatics.

    Science.gov (United States)

    Fufezan, Christian; Specht, Michael

    2009-08-21

    High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  8. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. Results p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files. p3d's strength arises from the combination of a very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP tree, b set theory and c functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  9. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    Science.gov (United States)

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live linux distributions for Bioinformatics have paved way for portability of Bioinformatics workbench in a platform independent manner. Moreover, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, open discovery - a live linux distribution has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with complete sequence analysis environment and is capable of running windows executable programs in Linux environment. Open discovery portrays the advanced customizable configuration of fedora, with data persistency accessible via USB drive or DVD. The Open Discovery is distributed free under Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  10. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  11. Parallel evolutionary computation in bioinformatics applications.

    Science.gov (United States)

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle its inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of its efficient execution on a wide range of parallel architectures. The proposed approach focuses on the easiness of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows the user to easily configure its environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  12. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  13. Using EMBL-EBI Services via Web Interface and Programmatically via Web Services.

    Science.gov (United States)

    Lopez, Rodrigo; Cowley, Andrew; Li, Weizhong; McWilliam, Hamish

    2014-12-12

    The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of databases and analysis tools that are of key importance in bioinformatics. As well as providing Web interfaces to these resources, Web Services are available using SOAP and REST protocols that enable programmatic access to our resources and allow their integration into other applications and analytical workflows. This unit describes the various options available to a typical researcher or bioinformatician who wishes to use our resources via Web interface or programmatically via a range of programming languages. Copyright © 2014 John Wiley & Sons, Inc.

  14. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  15. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Directory of Open Access Journals (Sweden)

    David K Brown

    Full Text Available Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS, a workflow management system and web interface for high performance computing (HPC. JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  16. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Science.gov (United States)

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  17. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    Science.gov (United States)

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  18. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  19. Reactivity on the Web

    OpenAIRE

    Bailey, James; Bry, François; Eckert, Michael; Patrânjan, Paula Lavinia

    2005-01-01

    Reactivity, the ability to detect simple and composite events and respond in a timely manner, is an essential requirement in many present-day information systems. With the emergence of new, dynamic Web applications, reactivity on the Web is receiving increasing attention. Reactive Web-based systems need to detect and react not only to simple events but also to complex, real-life situations. This paper introduces XChange, a language for programming reactive behaviour on the Web,...

  20. A Secure Web Application Providing Public Access to High-Performance Data Intensive Scientific Resources - ScalaBLAST Web Application

    International Nuclear Information System (INIS)

    Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.

    2008-01-01

    This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroic effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster

  1. Cytoscape tools for the web age: D3.js and Cytoscape.js exporters.

    Science.gov (United States)

    Ono, Keiichiro; Demchak, Barry; Ideker, Trey

    2014-01-01

    In this paper we present new data export modules for Cytoscape 3 that can generate network files for Cytoscape.js and D3.js. Cytoscape.js exporter is implemented as a core feature of Cytoscape 3, and D3.js exporter is available as a Cytoscape 3 app. These modules enable users to seamlessly export network and table data sets generated in Cytoscape to popular JavaScript library readable formats. In addition, we implemented template web applications for browser-based interactive network visualization that can be used as basis for complex data visualization applications for bioinformatics research. Example web applications created with these tools demonstrate how Cytoscape works in modern data visualization workflows built with traditional desktop tools and emerging web-based technologies. This interactivity enables researchers more flexibility than with static images, thereby greatly improving the quality of insights researchers can gain from them.

  2. Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package.

    Science.gov (United States)

    El-Kalioby, Mohamed; Abouelhoda, Mohamed; Krüger, Jan; Giegerich, Robert; Sczyrba, Alexander; Wall, Dennis P; Tonellato, Peter

    2012-01-01

    Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org.

  3. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  4. Applying semantic web services to enterprise web

    OpenAIRE

    Hu, Y; Yang, Q P; Sun, X; Wei, P

    2008-01-01

    Enterprise Web provides a convenient, extendable, integrated platform for information sharing and knowledge management. However, it still has many drawbacks due to complexity and increasing information glut, as well as the heterogeneity of the information processed. Research in the field of Semantic Web Services has shown the possibility of adding higher level of semantic functionality onto the top of current Enterprise Web, enhancing usability and usefulness of resource, enabling decision su...

  5. Learning Vue.js 2 learn how to build amazing and complex reactive web applications easily with Vue.js

    CERN Document Server

    Filipova, Olga

    2016-01-01

    About This Book Learn how to propagate DOM changes across the website without writing extensive jQuery callbacks code. Learn how to achieve reactivity and easily compose views with Vue.js and understand what it does behind the scenes. Explore the core features of Vue.js with small examples, learn how to build dynamic content into preexisting web applications, and build Vue.js applications from scratch. Who This Book Is For This book is perfect for novice web developer seeking to learn new technologies or frameworks and also for webdev gurus eager to enrich their experience. Whatever your level of expertise, this book is a great introduction to the wonderful world of reactive web apps. What You Will Learn Build a fully functioning reactive web application in Vue.js from scratch. The importance of the MVVM architecture and how Vue.js compares with other frameworks such as Angular.js and React.js. How to bring reactivity to an existing static application using Vue.js. How to use p...

  6. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    Science.gov (United States)

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  7. The EMBRACE web service collection

    DEFF Research Database (Denmark)

    Pettifer, S.; Ison, J.; Kalas, M.

    2010-01-01

    The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order...... for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM......, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.embraceregistry.net) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection...

  8. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  10. dAPE: a web server to detect homorepeats and follow their evolution.

    Science.gov (United States)

    Mier, Pablo; Andrade-Navarro, Miguel A

    2017-04-15

    Homorepeats are low complexity regions consisting of repetitions of a single amino acid residue. There is no current consensus on the minimum number of residues needed to define a functional homorepeat, nor even if mismatches are allowed. Here we present dAPE, a web server that helps following the evolution of homorepeats based on orthology information, using a sensitive but tunable cutoff to help in the identification of emerging homorepeats. dAPE can be accessed from http://cbdm-01.zdv.uni-mainz.de/∼munoz/polyx . munoz@uni-mainz.de. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  11. Complexity

    Indian Academy of Sciences (India)

    Rahul Pandit

    2008-10-31

    Oct 31, 2008 ... Centre for Condensed Matter Theory. Department of Physics. Indian Institute ... Interactions between a system's components are important role. ... Scale-free networks in, say, social networks or the world-wide web. ▻ A system ...

  12. DelPhi Web Server: A comprehensive online suite for electrostatic calculations of biological macromolecules and their complexes

    Science.gov (United States)

    Sarkar, Subhra; Witham, Shawn; Zhang, Jie; Zhenirovskyy, Maxim; Rocchia, Walter; Alexov, Emil

    2011-01-01

    Here we report a web server, the DelPhi web server, which utilizes DelPhi program to calculate electrostatic energies and the corresponding electrostatic potential and ionic distributions, and dielectric map. The server provides extra services to fix structural defects, as missing atoms in the structural file and allows for generation of missing hydrogen atoms. The hydrogen placement and the corresponding DelPhi calculations can be done with user selected force field parameters being either Charmm22, Amber98 or OPLS. Upon completion of the calculations, the user is given option to download fixed and protonated structural file, together with the parameter and Delphi output files for further analysis. Utilizing Jmol viewer, the user can see the corresponding structural file, to manipulate it and to change the presentation. In addition, if the potential map is requested to be calculated, the potential can be mapped onto the molecule surface. The DelPhi web server is available from http://compbio.clemson.edu/delphi_webserver. PMID:24683424

  13. EpiCollect+: linking smartphones to web applications for complex data collection projects [v1; ref status: indexed, http://f1000r.es/3vk

    Directory of Open Access Journals (Sweden)

    David M. Aanensen

    2014-08-01

    Full Text Available Previously, we have described the development of the generic mobile phone data gathering tool, EpiCollect, and an associated web application, providing two-way communication between multiple data gatherers and a project database. This software only allows data collection on the phone using a single questionnaire form that is tailored to the needs of the user (including a single GPS point and photo per entry, whereas many applications require a more complex structure, allowing users to link a series of forms in a linear or branching hierarchy, along with the addition of any number of media types accessible from smartphones and/or tablet devices (e.g., GPS, photos, videos, sound clips and barcode scanning. A much enhanced version of EpiCollect has been developed (EpiCollect+. The individual data collection forms in EpiCollect+ provide more design complexity than the single form used in EpiCollect, and the software allows the generation of complex data collection projects through the ability to link many forms together in a linear (or branching hierarchy. Furthermore, EpiCollect+ allows the collection of multiple media types as well as standard text fields, increased data validation and form logic. The entire process of setting up a complex mobile phone data collection project to the specification of a user (project and form definitions can be undertaken at the EpiCollect+ website using a simple ‘drag and drop’ procedure, with visualisation of the data gathered using Google Maps and charts at the project website. EpiCollect+ is suitable for situations where multiple users transmit complex data by mobile phone (or other Android devices to a single project web database and is already being used for a range of field projects, particularly public health projects in sub-Saharan Africa. However, many uses can be envisaged from education, ecology and epidemiology to citizen science.

  14. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora programs for PCR primer design reflects the.

  15. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  16. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  17. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  18. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  19. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  20. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  1. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is the need for appropriate strategies of introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool an implementation of ...

  2. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  3. Fractal Hypothesis of the Pelagic Microbial Ecosystem—Can Simple Ecological Principles Lead to Self-Similar Complexity in the Pelagic Microbial Food Web?

    Science.gov (United States)

    Våge, Selina; Thingstad, T. Frede

    2015-01-01

    Trophic interactions are highly complex and modern sequencing techniques reveal enormous biodiversity across multiple scales in marine microbial communities. Within the chemically and physically relatively homogeneous pelagic environment, this calls for an explanation beyond spatial and temporal heterogeneity. Based on observations of simple parasite-host and predator-prey interactions occurring at different trophic levels and levels of phylogenetic resolution, we present a theoretical perspective on this enormous biodiversity, discussing in particular self-similar aspects of pelagic microbial food web organization. Fractal methods have been used to describe a variety of natural phenomena, with studies of habitat structures being an application in ecology. In contrast to mathematical fractals where pattern generating rules are readily known, however, identifying mechanisms that lead to natural fractals is not straight-forward. Here we put forward the hypothesis that trophic interactions between pelagic microbes may be organized in a fractal-like manner, with the emergent network resembling the structure of the Sierpinski triangle. We discuss a mechanism that could be underlying the formation of repeated patterns at different trophic levels and discuss how this may help understand characteristic biomass size-spectra that hint at scale-invariant properties of the pelagic environment. If the idea of simple underlying principles leading to a fractal-like organization of the pelagic food web could be formalized, this would extend an ecologists mindset on how biological complexity could be accounted for. It may furthermore benefit ecosystem modeling by facilitating adequate model resolution across multiple scales. PMID:26648929

  4. Scalable web services for the PSIPRED Protein Analysis Workbench.

    Science.gov (United States)

    Buchan, Daniel W A; Minneci, Federico; Nugent, Tim C O; Bryson, Kevin; Jones, David T

    2013-07-01

    Here, we present the new UCL Bioinformatics Group's PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The new web portal provides a greatly streamlined user interface with a number of new features to allow users to better explore their results. We offer a number of additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC web server access and new HADOOP packages. All software and services are available via the UCL Bioinformatics Group website at http://bioinf.cs.ucl.ac.uk/.

  5. Video Bioinformatics Analysis of Human Embryonic Stem Cell Colony Growth

    Science.gov (United States)

    Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue

    2010-01-01

    Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform dating mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion. PMID:20495527

  6. Automatically exposing OpenLifeData via SADI semantic Web Services.

    Science.gov (United States)

    González, Alejandro Rodríguez; Callahan, Alison; Cruz-Toledo, José; Garcia, Adrian; Egaña Aranguren, Mikel; Dumontier, Michel; Wilkinson, Mark D

    2014-01-01

    Two distinct trends are emerging with respect to how data is shared, collected, and analyzed within the bioinformatics community. First, Linked Data, exposed as SPARQL endpoints, promises to make data easier to collect and integrate by moving towards the harmonization of data syntax, descriptive vocabularies, and identifiers, as well as providing a standardized mechanism for data access. Second, Web Services, often linked together into workflows, normalize data access and create transparent, reproducible scientific methodologies that can, in principle, be re-used and customized to suit new scientific questions. Constructing queries that traverse semantically-rich Linked Data requires substantial expertise, yet traditional RESTful or SOAP Web Services cannot adequately describe the content of a SPARQL endpoint. We propose that content-driven Semantic Web Services can enable facile discovery of Linked Data, independent of their location. We use a well-curated Linked Dataset - OpenLifeData - and utilize its descriptive metadata to automatically configure a series of more than 22,000 Semantic Web Services that expose all of its content via the SADI set of design principles. The OpenLifeData SADI services are discoverable via queries to the SHARE registry and easy to integrate into new or existing bioinformatics workflows and analytical pipelines. We demonstrate the utility of this system through comparison of Web Service-mediated data access with traditional SPARQL, and note that this approach not only simplifies data retrieval, but simultaneously provides protection against resource-intensive queries. We show, through a variety of different clients and examples of varying complexity, that data from the myriad OpenLifeData can be recovered without any need for prior-knowledge of the content or structure of the SPARQL endpoints. We also demonstrate that, via clients such as SHARE, the complexity of federated SPARQL queries is dramatically reduced.

  7. Development of Web GIS for complex processing and visualization of climate geospatial datasets as an integral part of dedicated Virtual Research Environment

    Science.gov (United States)

    Gordov, Evgeny; Okladnikov, Igor; Titov, Alexander

    2017-04-01

    For comprehensive usage of large geospatial meteorological and climate datasets it is necessary to create a distributed software infrastructure based on the spatial data infrastructure (SDI) approach. Currently, it is generally accepted that the development of client applications as integrated elements of such infrastructure should be based on the usage of modern web and GIS technologies. The paper describes the Web GIS for complex processing and visualization of geospatial (mainly in NetCDF and PostGIS formats) datasets as an integral part of the dedicated Virtual Research Environment for comprehensive study of ongoing and possible future climate change, and analysis of their implications, providing full information and computing support for the study of economic, political and social consequences of global climate change at the global and regional levels. The Web GIS consists of two basic software parts: 1. Server-side part representing PHP applications of the SDI geoportal and realizing the functionality of interaction with computational core backend, WMS/WFS/WPS cartographical services, as well as implementing an open API for browser-based client software. Being the secondary one, this part provides a limited set of procedures accessible via standard HTTP interface. 2. Front-end part representing Web GIS client developed according to a "single page application" technology based on JavaScript libraries OpenLayers (http://openlayers.org/), ExtJS (https://www.sencha.com/products/extjs), GeoExt (http://geoext.org/). It implements application business logic and provides intuitive user interface similar to the interface of such popular desktop GIS applications, as uDIG, QuantumGIS etc. Boundless/OpenGeo architecture was used as a basis for Web-GIS client development. According to general INSPIRE requirements to data visualization Web GIS provides such standard functionality as data overview, image navigation, scrolling, scaling and graphical overlay, displaying map

  8. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology

    Directory of Open Access Journals (Sweden)

    Fernando González-Nilo

    2011-01-01

    Full Text Available After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  9. Semantic Web meets Integrative Biology: a survey.

    Science.gov (United States)

    Chen, Huajun; Yu, Tong; Chen, Jake Y

    2013-01-01

    Integrative Biology (IB) uses experimental or computational quantitative technologies to characterize biological systems at the molecular, cellular, tissue and population levels. IB typically involves the integration of the data, knowledge and capabilities across disciplinary boundaries in order to solve complex problems. We identify a series of bioinformatics problems posed by interdisciplinary integration: (i) data integration that interconnects structured data across related biomedical domains; (ii) ontology integration that brings jargons, terminologies and taxonomies from various disciplines into a unified network of ontologies; (iii) knowledge integration that integrates disparate knowledge elements from multiple sources; (iv) service integration that build applications out of services provided by different vendors. We argue that IB can benefit significantly from the integration solutions enabled by Semantic Web (SW) technologies. The SW enables scientists to share content beyond the boundaries of applications and websites, resulting into a web of data that is meaningful and understandable to any computers. In this review, we provide insight into how SW technologies can be used to build open, standardized and interoperable solutions for interdisciplinary integration on a global basis. We present a rich set of case studies in system biology, integrative neuroscience, bio-pharmaceutics and translational medicine, to highlight the technical features and benefits of SW applications in IB.

  10. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  11. Toward the prediction of class I and II mouse major histocompatibility complex-peptide-binding affinity: in silico bioinformatic step-by-step guide using quantitative structure-activity relationships.

    Science.gov (United States)

    Hattotuwagama, Channa K; Doytchinova, Irini A; Flower, Darren R

    2007-01-01

    Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative, classification methods, but these are now giving way to quantitative regression methods. We review three methods--a 2D-QSAR additive-partial least squares (PLS) and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method--which can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third method is an iterative self-consistent (ISC) PLS-based additive method, which is a recently developed extension to the additive method for the affinity prediction of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology, useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*6801, HLA-A*6802, HLA-B*3501, H2-K(k), H2-K(b), H2-D(b) HLA-DRB1*0101, HLA-DRB1*0401, HLA-DRB1*0701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we show a step-by-step guide into predicting the reliability and the resulting models to represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method

  12. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzen, China, September 20-22, 2017.

  13. XML schemas for common bioinformatic data types and their application in workflow systems.

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-11-06

    Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

  14. XML schemas for common bioinformatic data types and their application in workflow systems

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-01-01

    Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at , the BioDOM library can be obtained at . Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823

  15. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology.High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demands of developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of American on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics. The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  16. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    Science.gov (United States)

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  17. A review of bioinformatic methods for forensic DNA analyses.

    Science.gov (United States)

    Liu, Yao-Yuan; Harbison, SallyAnn

    2018-03-01

    Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Agonist Binding to Chemosensory Receptors: A Systematic Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Fabrizio Fierro

    2017-09-01

    Full Text Available Human G-protein coupled receptors (hGPCRs constitute a large and highly pharmaceutically relevant membrane receptor superfamily. About half of the hGPCRs' family members are chemosensory receptors, involved in bitter taste and olfaction, along with a variety of other physiological processes. Hence these receptors constitute promising targets for pharmaceutical intervention. Molecular modeling has been so far the most important tool to get insights on agonist binding and receptor activation. Here we investigate both aspects by bioinformatics-based predictions across all bitter taste and odorant receptors for which site-directed mutagenesis data are available. First, we observe that state-of-the-art homology modeling combined with previously used docking procedures turned out to reproduce only a limited fraction of ligand/receptor interactions inferred by experiments. This is most probably caused by the low sequence identity with available structural templates, which limits the accuracy of the protein model and in particular of the side-chains' orientations. Methods which transcend the limited sampling of the conformational space of docking may improve the predictions. As an example corroborating this, we review here multi-scale simulations from our lab and show that, for the three complexes studied so far, they significantly enhance the predictive power of the computational approach. Second, our bioinformatics analysis provides support to previous claims that several residues, including those at positions 1.50, 2.50, and 7.52, are involved in receptor activation.

  19. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed.

  20. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...

  1. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  2. Bioinformatics on the Cloud Computing Platform Azure

    Science.gov (United States)

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  3. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  4. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  5. Development of a Web-Enabled Informatics Platform for Manipulation of Gene Expression Data

    National Research Council Canada - National Science Library

    Peel, Sheila A; Kopydlowski, Karen; Carel, Roland

    2004-01-01

    .... The Array Repository Data Analysis System (ARDAS 1.0) at Walter Reed Army Institute of Research is a web-enabled bioinformatic platform consisting of a Laboratory Information Management System (LIMS...

  6. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive, ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download. Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  7. Web tools for predictive toxicology model building.

    Science.gov (United States)

    Jeliazkova, Nina

    2012-07-01

    The development and use of web tools in chemistry has accumulated more than 15 years of history already. Powered by the advances in the Internet technologies, the current generation of web systems are starting to expand into areas, traditional for desktop applications. The web platforms integrate data storage, cheminformatics and data analysis tools. The ease of use and the collaborative potential of the web is compelling, despite the challenges. The topic of this review is a set of recently published web tools that facilitate predictive toxicology model building. The focus is on software platforms, offering web access to chemical structure-based methods, although some of the frameworks could also provide bioinformatics or hybrid data analysis functionalities. A number of historical and current developments are cited. In order to provide comparable assessment, the following characteristics are considered: support for workflows, descriptor calculations, visualization, modeling algorithms, data management and data sharing capabilities, availability of GUI or programmatic access and implementation details. The success of the Web is largely due to its highly decentralized, yet sufficiently interoperable model for information access. The expected future convergence between cheminformatics and bioinformatics databases provides new challenges toward management and analysis of large data sets. The web tools in predictive toxicology will likely continue to evolve toward the right mix of flexibility, performance, scalability, interoperability, sets of unique features offered, friendly user interfaces, programmatic access for advanced users, platform independence, results reproducibility, curation and crowdsourcing utilities, collaborative sharing and secure access.

  8. Designing Effective Web Forms for Older Web Users

    Science.gov (United States)

    Li, Hui; Rau, Pei-Luen Patrick; Fujimura, Kaori; Gao, Qin; Wang, Lin

    2012-01-01

    This research aims to provide insight for web form design for older users. The effects of task complexity and information structure of web forms on older users' performance were examined. Forty-eight older participants with abundant computer and web experience were recruited. The results showed significant differences in task time and error rate…

  9. WebCom: A Model for Understanding Web Site Communication

    DEFF Research Database (Denmark)

    Godsk, Mikkel; Petersen, Anja Bechmann

    2008-01-01

    of the approaches' strengths. Furthermore, it is discussed and shortly demonstrated how WebCom can be used for analytical and design purposes with YouTube as an example. The chapter concludes that WebCom is able to serve as a theoretically-based model for understanding complex Web site communication situations...

  10. Achievements and challenges in structural bioinformatics and computational biophysics.

    Science.gov (United States)

    Samish, Ilan; Bourne, Philip E; Najmanovich, Rafael J

    2015-01-01

    The field of structural bioinformatics and computational biophysics has undergone a revolution in the last 10 years. Developments that are captured annually through the 3DSIG meeting, upon which this article reflects. An increase in the accessible data, computational resources and methodology has resulted in an increase in the size and resolution of studied systems and the complexity of the questions amenable to research. Concomitantly, the parameterization and efficiency of the methods have markedly improved along with their cross-validation with other computational and experimental results. The field exhibits an ever-increasing integration with biochemistry, biophysics and other disciplines. In this article, we discuss recent achievements along with current challenges within the field. © The Author 2014. Published by Oxford University Press.

  11. Establishing a master's degree programme in bioinformatics: challenges and opportunities.

    Science.gov (United States)

    Sahinidis, N V; Harandi, M T; Heath, M T; Murphy, L; Snir, M; Wheeler, R P; Zukoski, C F

    2005-12-01

    The development of the Bioinformatics MS degree program at the University of Illinois, the challenges and opportunities associated with such a process, and the current structure of the program is described. This program has departed from earlier University practice in significant ways. Despite the existence of several interdisciplinary programs at the University, a few of which grant degrees, this is the first interdisciplinary program that grants degrees and formally recognises departmental specialisation areas. The program, which is not owned by any particular department but by the Graduate College itself, is operated in a franchise-like fashion via several departmental concentrations. With four different colleges and many more departments involved in establishing and operating the program, the logistics of the operation are of considerable complexity but result in significant interactions across the entire campus.

  12. Single-Cell Transcriptomics Bioinformatics and Computational Challenges

    Directory of Open Access Journals (Sweden)

    Lana Garmire

    2016-09-01

    Full Text Available The emerging single-cell RNA-Seq (scRNA-Seq technology holds the promise to revolutionize our understanding of diseases and associated biological processes at an unprecedented resolution. It opens the door to reveal the intercellular heterogeneity and has been employed to a variety of applications, ranging from characterizing cancer cells subpopulations to elucidating tumor resistance mechanisms. Parallel to improving experimental protocols to deal with technological issues, deriving new analytical methods to reveal the complexity in scRNA-Seq data is just as challenging. Here we review the current state-of-the-art bioinformatics tools and methods for scRNA-Seq analysis, as well as addressing some critical analytical challenges that the field faces.

  13. A bioinformatics roadmap for the human vaccines project.

    Science.gov (United States)

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  14. Web Accessibility and Guidelines

    Science.gov (United States)

    Harper, Simon; Yesilada, Yeliz

    Access to, and movement around, complex online environments, of which the World Wide Web (Web) is the most popular example, has long been considered an important and major issue in the Web design and usability field. The commonly used slang phrase ‘surfing the Web’ implies rapid and free access, pointing to its importance among designers and users alike. It has also been long established that this potentially complex and difficult access is further complicated, and becomes neither rapid nor free, if the user is disabled. There are millions of people who have disabilities that affect their use of the Web. Web accessibility aims to help these people to perceive, understand, navigate, and interact with, as well as contribute to, the Web, and thereby the society in general. This accessibility is, in part, facilitated by the Web Content Accessibility Guidelines (WCAG) currently moving from version one to two. These guidelines are intended to encourage designers to make sure their sites conform to specifications, and in that conformance enable the assistive technologies of disabled users to better interact with the page content. In this way, it was hoped that accessibility could be supported. While this is in part true, guidelines do not solve all problems and the new WCAG version two guidelines are surrounded by controversy and intrigue. This chapter aims to establish the published literature related to Web accessibility and Web accessibility guidelines, and discuss limitations of the current guidelines and future directions.

  15. Interactive metagenomic visualization in a Web browser

    Directory of Open Access Journals (Sweden)

    Phillippy Adam M

    2011-09-01

    Full Text Available Abstract Background A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Results Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Conclusions Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.

  16. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given on how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  17. RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.

    Science.gov (United States)

    D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano

    2015-01-01

    The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways make RNA-Seq one of most complex field of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer to the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call for statistically significant differences in genes and transcripts expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. This strategy allows to access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export

  18. Modern bioinformatics meets traditional Chinese medicine.

    Science.gov (United States)

    Gu, Peiqin; Chen, Huajun

    2014-11-01

    Traditional Chinese medicine (TCM) is gaining increasing attention with the emergence of integrative medicine and personalized medicine, characterized by pattern differentiation on individual variance and treatments based on natural herbal synergism. Investigating the effectiveness and safety of the potential mechanisms of TCM and the combination principles of drug therapies will bridge the cultural gap with Western medicine and improve the development of integrative medicine. Dealing with rapidly growing amounts of biomedical data and their heterogeneous nature are two important tasks among modern biomedical communities. Bioinformatics, as an emerging interdisciplinary field of computer science and biology, has become a useful tool for easing the data deluge pressure by automating the computation processes with informatics methods. Using these methods to retrieve, store and analyze the biomedical data can effectively reveal the associated knowledge hidden in the data, and thus promote the discovery of integrated information. Recently, these techniques of bioinformatics have been used for facilitating the interactional effects of both Western medicine and TCM. The analysis of TCM data using computational technologies provides biological evidence for the basic understanding of TCM mechanisms, safety and efficacy of TCM treatments. At the same time, the carrier and targets associated with TCM remedies can inspire the rethinking of modern drug development. This review summarizes the significant achievements of applying bioinformatics techniques to many aspects of the research in TCM, such as analysis of TCM-related '-omics' data and techniques for analyzing biological processes and pharmaceutical mechanisms of TCM, which have shown certain potential of bringing new thoughts to both sides. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  19. EVpedia: a community web portal for extracellular vesicles research.

    Science.gov (United States)

    Kim, Dae-Kyum; Lee, Jaewook; Kim, Sae Rom; Choi, Dong-Sic; Yoon, Yae Jin; Kim, Ji Hyun; Go, Gyeongyun; Nhung, Dinh; Hong, Kahye; Jang, Su Chul; Kim, Si-Hyun; Park, Kyong-Su; Kim, Oh Youn; Park, Hyun Taek; Seo, Ji Hye; Aikawa, Elena; Baj-Krzyworzeka, Monika; van Balkom, Bas W M; Belting, Mattias; Blanc, Lionel; Bond, Vincent; Bongiovanni, Antonella; Borràs, Francesc E; Buée, Luc; Buzás, Edit I; Cheng, Lesley; Clayton, Aled; Cocucci, Emanuele; Dela Cruz, Charles S; Desiderio, Dominic M; Di Vizio, Dolores; Ekström, Karin; Falcon-Perez, Juan M; Gardiner, Chris; Giebel, Bernd; Greening, David W; Gross, Julia Christina; Gupta, Dwijendra; Hendrix, An; Hill, Andrew F; Hill, Michelle M; Nolte-'t Hoen, Esther; Hwang, Do Won; Inal, Jameel; Jagannadham, Medicharla V; Jayachandran, Muthuvel; Jee, Young-Koo; Jørgensen, Malene; Kim, Kwang Pyo; Kim, Yoon-Keun; Kislinger, Thomas; Lässer, Cecilia; Lee, Dong Soo; Lee, Hakmo; van Leeuwen, Johannes; Lener, Thomas; Liu, Ming-Lin; Lötvall, Jan; Marcilla, Antonio; Mathivanan, Suresh; Möller, Andreas; Morhayim, Jess; Mullier, François; Nazarenko, Irina; Nieuwland, Rienk; Nunes, Diana N; Pang, Ken; Park, Jaesung; Patel, Tushar; Pocsfalvi, Gabriella; Del Portillo, Hernando; Putz, Ulrich; Ramirez, Marcel I; Rodrigues, Marcio L; Roh, Tae-Young; Royo, Felix; Sahoo, Susmita; Schiffelers, Raymond; Sharma, Shivani; Siljander, Pia; Simpson, Richard J; Soekmadji, Carolina; Stahl, Philip; Stensballe, Allan; Stępień, Ewa; Tahara, Hidetoshi; Trummer, Arne; Valadi, Hadi; Vella, Laura J; Wai, Sun Nyunt; Witwer, Kenneth; Yáñez-Mó, María; Youn, Hyewon; Zeidler, Reinhard; Gho, Yong Song

    2015-03-15

    Extracellular vesicles (EVs) are spherical bilayered proteolipids, harboring various bioactive molecules. Due to the complexity of the vesicular nomenclatures and components, online searches for EV-related publications and vesicular components are currently challenging. We present an improved version of EVpedia, a public database for EVs research. This community web portal contains a database of publications and vesicular components, identification of orthologous vesicular components, bioinformatic tools and a personalized function. EVpedia includes 6879 publications, 172 080 vesicular components from 263 high-throughput datasets, and has been accessed more than 65 000 times from more than 750 cities. In addition, about 350 members from 73 international research groups have participated in developing EVpedia. This free web-based database might serve as a useful resource to stimulate the emerging field of EV research. The web site was implemented in PHP, Java, MySQL and Apache, and is freely available at http://evpedia.info. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  20. Multiobjective optimization in bioinformatics and computational biology.

    Science.gov (United States)

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.

  1. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  2. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  3. clubber: removing the bioinformatics bottleneck in big data analyses

    Science.gov (United States)

    Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana

    2018-01-01

    With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment. PMID:28609295

  4. clubber: removing the bioinformatics bottleneck in big data analyses.

    Science.gov (United States)

    Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana

    2017-06-13

    With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these "big data" analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber's goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.

  5. clubber: removing the bioinformatics bottleneck in big data analyses

    Directory of Open Access Journals (Sweden)

    Miller Maximilian

    2017-06-01

    Full Text Available With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min clearly illustrate the importance of clubber in the everyday computational biology environment.

  6. ncRNA-class Web Tool: Non-coding RNA feature extraction and pre-miRNA classification web tool

    KAUST Repository

    Kleftogiannis, Dimitrios A.; Theofilatos, Konstantinos A.; Papadimitriou, Stergios; Tsakalidis, Athanasios K.; Likothanassis, Spiridon D.; Mavroudi, Seferina P.

    2012-01-01

    Until recently, it was commonly accepted that most genetic information is transacted by proteins. Recent evidence suggests that the majority of the genomes of mammals and other complex organisms are in fact transcribed into non-coding RNAs (ncRNAs), many of which are alternatively spliced and/or processed into smaller products. Non coding RNA genes analysis requires the calculation of several sequential, thermodynamical and structural features. Many independent tools have already been developed for the efficient calculation of such features but to the best of our knowledge there does not exist any integrative approach for this task. The most significant amount of existing work is related to the miRNA class of non-coding RNAs. MicroRNAs (miRNAs) are small non-coding RNAs that play a significant role in gene regulation and their prediction is a challenging bioinformatics problem. Non-coding RNA feature extraction and pre-miRNA classification Web Tool (ncRNA-class Web Tool) is a publicly available web tool ( http://150.140.142.24:82/Default.aspx ) which provides a user friendly and efficient environment for the effective calculation of a set of 58 sequential, thermodynamical and structural features of non-coding RNAs, plus a tool for the accurate prediction of miRNAs. © 2012 IFIP International Federation for Information Processing.

  7. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  8. LECTINPred: web Server that Uses Complex Networks of Protein Structure for Prediction of Lectins with Potential Use as Cancer Biomarkers or in Parasite Vaccine Design.

    Science.gov (United States)

    Munteanu, Cristian R; Pedreira, Nieves; Dorado, Julián; Pazos, Alejandro; Pérez-Montoto, Lázaro G; Ubeira, Florencio M; González-Díaz, Humberto

    2014-04-01

    Lectins (Ls) play an important role in many diseases such as different types of cancer, parasitic infections and other diseases. Interestingly, the Protein Data Bank (PDB) contains +3000 protein 3D structures with unknown function. Thus, we can in principle, discover new Ls mining non-annotated structures from PDB or other sources. However, there are no general models to predict new biologically relevant Ls based on 3D chemical structures. We used the MARCH-INSIDE software to calculate the Markov-Shannon 3D electrostatic entropy parameters for the complex networks of protein structure of 2200 different protein 3D structures, including 1200 Ls. We have performed a Linear Discriminant Analysis (LDA) using these parameters as inputs in order to seek a new Quantitative Structure-Activity Relationship (QSAR) model, which is able to discriminate 3D structure of Ls from other proteins. We implemented this predictor in the web server named LECTINPred, freely available at http://bio-aims.udc.es/LECTINPred.php. This web server showed the following goodness-of-fit statistics: Sensitivity=96.7 % (for Ls), Specificity=87.6 % (non-active proteins), and Accuracy=92.5 % (for all proteins), considering altogether both the training and external prediction series. In mode 2, users can carry out an automatic retrieval of protein structures from PDB. We illustrated the use of this server, in operation mode 1, performing a data mining of PDB. We predicted Ls scores for +2000 proteins with unknown function and selected the top-scored ones as possible lectins. In operation mode 2, LECTINPred can also upload 3D structural models generated with structure-prediction tools like LOMETS or PHYRE2. The new Ls are expected to be of relevance as cancer biomarkers or useful in parasite vaccine design. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Web Mining

    Science.gov (United States)

    Fürnkranz, Johannes

    The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. This chapter provides a brief overview of web mining techniques and research areas, most notably hypertext classification, wrapper induction, recommender systems and web usage mining.

  10. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  11. OpenHelix: bioinformatics education outside of a different box.

    Science.gov (United States)

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.

  12. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  13. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  14. Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery

    Science.gov (United States)

    Guingab-Cagmat, J.D.; Cagmat, E.B.; Hayes, R.L.; Anagli, J.

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  15. LDAP: a web server for lncRNA-disease association prediction.

    Science.gov (United States)

    Lan, Wei; Li, Min; Zhao, Kaijie; Liu, Jin; Wu, Fang-Xiang; Pan, Yi; Wang, Jianxin

    2017-02-01

    Increasing evidences have demonstrated that long noncoding RNAs (lncRNAs) play important roles in many human diseases. Therefore, predicting novel lncRNA-disease associations would contribute to dissect the complex mechanisms of disease pathogenesis. Some computational methods have been developed to infer lncRNA-disease associations. However, most of these methods infer lncRNA-disease associations only based on single data resource. In this paper, we propose a new computational method to predict lncRNA-disease associations by integrating multiple biological data resources. Then, we implement this method as a web server for lncRNA-disease association prediction (LDAP). The input of the LDAP server is the lncRNA sequence. The LDAP predicts potential lncRNA-disease associations by using a bagging SVM classifier based on lncRNA similarity and disease similarity. The web server is available at http://bioinformatics.csu.edu.cn/ldap jxwang@mail.csu.edu.cn. Supplementary data are available at Bioinformatics online.

  16. Web-based services for drug design and discovery.

    Science.gov (United States)

    Frey, Jeremy G; Bird, Colin L

    2011-09-01

    Reviews of the development of drug discovery through the 20(th) century recognised the importance of chemistry and increasingly bioinformatics, but had relatively little to say about the importance of computing and networked computing in particular. However, the design and discovery of new drugs is arguably the most significant single application of bioinformatics and cheminformatics to have benefitted from the increases in the range and power of the computational techniques since the emergence of the World Wide Web, commonly now referred to as simply 'the Web'. Web services have enabled researchers to access shared resources and to deploy standardized calculations in their search for new drugs. This article first considers the fundamental principles of Web services and workflows, and then explores the facilities and resources that have evolved to meet the specific needs of chem- and bio-informatics. This strategy leads to a more detailed examination of the basic components that characterise molecules and the essential predictive techniques, followed by a discussion of the emerging networked services that transcend the basic provisions, and the growing trend towards embracing modern techniques, in particular the Semantic Web. In the opinion of the authors, the issues that require community action are: increasing the amount of chemical data available for open access; validating the data as provided; and developing more efficient links between the worlds of cheminformatics and bioinformatics. The goal is to create ever better drug design services.

  17. Microbase2.0: A Generic Framework for Computationally Intensive Bioinformatics Workflows in the Cloud

    OpenAIRE

    Flanagan Keith; Nakjang Sirintra; Hallinan Jennifer; Harwood Colin; Hirt Robert P.; Pocock Matthew R.; Wipat Anil

    2012-01-01

    As bioinformatics datasets grow ever larger, and analyses become increasingly complex, there is a need for data handling infrastructures to keep pace with developing technology. One solution is to apply Grid and Cloud technologies to address the computational requirements of analysing high throughput datasets. We present an approach for writing new, or wrapping existing applications, and a reference implementation of a framework, Microbase2.0, for executing those applications using Grid and C...

  18. Bioinformatic Analysis of Strawberry GSTF12 Gene

    Science.gov (United States)

    Wang, Xiran; Jiang, Leiyu; Tang, Haoru

    2018-01-01

    GSTF12 has always been known as a key factor of proanthocyanins accumulate in plant testa. Through bioinformatics analysis of the nucleotide and encoded protein sequence of GSTF12, it is more advantageous to the study of genes related to anthocyanin biosynthesis accumulation pathway. Therefore, we chosen GSTF12 gene of 11 kinds species, downloaded their nucleotide and protein sequence from NCBI as the research object, found strawberry GSTF12 gene via bioinformation analyse, constructed phylogenetic tree. At the same time, we analysed the strawberry GSTF12 gene of physical and chemical properties and its protein structure and so on. The phylogenetic tree showed that Strawberry and petunia were closest relative. By the protein prediction, we found that the protein owed one proper signal peptide without obvious transmembrane regions.

  19. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of softwares already exist for analyzing NGS data. These tools can be fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow.

  20. Combining multiple decisions: applications to bioinformatics

    International Nuclear Information System (INIS)

    Yukinawa, N; Ishii, S; Takenouchi, T; Oba, S

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods

  1. Data mining in bioinformatics using Weka.

    Science.gov (United States)

    Frank, Eibe; Hall, Mark; Trigg, Len; Holmes, Geoffrey; Witten, Ian H

    2004-10-12

    The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection-common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods complemented by graphical user interfaces for data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it. http://www.cs.waikato.ac.nz/ml/weka.

  2. Bioinformatic and Biometric Methods in Plant Morphology

    Directory of Open Access Journals (Sweden)

    Surangi W. Punyasena

    2014-08-01

    Full Text Available Recent advances in microscopy, imaging, and data analyses have permitted both the greater application of quantitative methods and the collection of large data sets that can be used to investigate plant morphology. This special issue, the first for Applications in Plant Sciences, presents a collection of papers highlighting recent methods in the quantitative study of plant form. These emerging biometric and bioinformatic approaches to plant sciences are critical for better understanding how morphology relates to ecology, physiology, genotype, and evolutionary and phylogenetic history. From microscopic pollen grains and charcoal particles, to macroscopic leaves and whole root systems, the methods presented include automated classification and identification, geometric morphometrics, and skeleton networks, as well as tests of the limits of human assessment. All demonstrate a clear need for these computational and morphometric approaches in order to increase the consistency, objectivity, and throughput of plant morphological studies.

  3. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES 27, 28 February 1, 2, 3 March 2006 from 11:00 to 12:00 - Auditorium, bldg. 500 Decoding the Genome A special series of 5 lectures on: Recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  4. Comprehensive Metabolite Identification Strategy Using Multiple Two-Dimensional NMR Spectra of a Complex Mixture Implemented in the COLMARm Web Server

    Energy Technology Data Exchange (ETDEWEB)

    Bingol, Kerem [Environmental; Li, Da-Wei; Zhang, Bo; Brüschweiler, Rafael

    2016-12-06

    Identification of metabolites in complex mixtures represents a key step in metabolomics. A new strategy is introduced, which is implemented in a new public web server, COLMARm, that permits the co-analysis of up to three 2D NMR spectra, namely 13C-1H HSQC, 1H-1H TOCSY, and 13C-1H HSQC-TOCSY for the comprehensive, accurate, and efficient performance of this task. The highly versatile and interactive nature of COLMARm permits its application to a wide range of metabolomics samples independent of the magnetic field. Database query is performed using the HSQC spectrum and the top metabolite hits are then validated against the TOCSY-type experiment(s) by superimposing the expected cross-peaks on the mixture spectrum. In this way the user can directly accept or reject candidate metabolites by taking advantage of the complementary spectral information offered by these experiments and their different sensitivities. The power of COLMARm is demonstrated for a human serum sample uncovering the existence of 14 metabolites that hitherto were not identified by NMR.

  5. A Web-Based Tool for Automatic Data Collection, Curation, and Visualization of Complex Healthcare Survey Studies including Social Network Analysis

    Directory of Open Access Journals (Sweden)

    José Alberto Benítez

    2017-01-01

    Full Text Available There is a great concern nowadays regarding alcohol consumption and drug abuse, especially in young people. Analyzing the social environment where these adolescents are immersed, as well as a series of measures determining the alcohol abuse risk or personal situation and perception using a number of questionnaires like AUDIT, FAS, KIDSCREEN, and others, it is possible to gain insight into the current situation of a given individual regarding his/her consumption behavior. But this analysis, in order to be achieved, requires the use of tools that can ease the process of questionnaire creation, data gathering, curation and representation, and later analysis and visualization to the user. This research presents the design and construction of a web-based platform able to facilitate each of the mentioned processes by integrating the different phases into an intuitive system with a graphical user interface that hides the complexity underlying each of the questionnaires and techniques used and presenting the results in a flexible and visual way, avoiding any manual handling of data during the process. Advantages of this approach are shown and compared to the previous situation where some of the tasks were accomplished by time consuming and error prone manipulations of data.

  6. A Web-Based Tool for Automatic Data Collection, Curation, and Visualization of Complex Healthcare Survey Studies including Social Network Analysis.

    Science.gov (United States)

    Benítez, José Alberto; Labra, José Emilio; Quiroga, Enedina; Martín, Vicente; García, Isaías; Marqués-Sánchez, Pilar; Benavides, Carmen

    2017-01-01

    There is a great concern nowadays regarding alcohol consumption and drug abuse, especially in young people. Analyzing the social environment where these adolescents are immersed, as well as a series of measures determining the alcohol abuse risk or personal situation and perception using a number of questionnaires like AUDIT, FAS, KIDSCREEN, and others, it is possible to gain insight into the current situation of a given individual regarding his/her consumption behavior. But this analysis, in order to be achieved, requires the use of tools that can ease the process of questionnaire creation, data gathering, curation and representation, and later analysis and visualization to the user. This research presents the design and construction of a web-based platform able to facilitate each of the mentioned processes by integrating the different phases into an intuitive system with a graphical user interface that hides the complexity underlying each of the questionnaires and techniques used and presenting the results in a flexible and visual way, avoiding any manual handling of data during the process. Advantages of this approach are shown and compared to the previous situation where some of the tasks were accomplished by time consuming and error prone manipulations of data.

  7. Configuration of Web services as parametric design

    NARCIS (Netherlands)

    Ten Teije, Annette; Van Harmelen, Frank; Wielinga, Bob

    2004-01-01

    The configuration of Web services is particularly hard given the heterogeneous, unreliable and open nature of the Web. Furthermore, such composite Web services are likely to be complex services, that will require adaptation for each specific use. Current approaches to Web service configuration are

  8. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  9. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  10. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    Bioinformatics is an interdisciplinary subject, which uses computer application, statistics, mathematics and engineering for the analysis and management of biological information. It has become an important tool for basic and applied research in veterinary sciences. Bioinformatics has brought about advancements into ...

  11. Recent developments in life sciences research: Role of bioinformatics

    African Journals Online (AJOL)

    Life sciences research and development has opened up new challenges and opportunities for bioinformatics. The contribution of bioinformatics advances made possible the mapping of the entire human genome and genomes of many other organisms in just over a decade. These discoveries, along with current efforts to ...

  12. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  13. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  14. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is lack of bioinformatics training in the universities, expect for short training courses in which ...

  15. Current status and future perspectives of bioinformatics in Tanzania ...

    African Journals Online (AJOL)

    The main bottleneck in advancing genomics in present times is the lack of expertise in using bioinformatics tools and approaches for data mining in raw DNA sequences generated by modern high throughput technologies such as next generation sequencing. Although bioinformatics has been making major progress and ...

  16. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  17. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  18. SIDECACHE: Information access, management and dissemination framework for web services.

    Science.gov (United States)

    Doderer, Mark S; Burkhardt, Cory; Robbins, Kay A

    2011-06-14

    Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually followed by web service restart. Requests for information obtained by dynamic access of upstream sources is sometimes subject to rate restrictions. SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework also has been used to share research results through the use of a SideCache derived web service.

  19. Head First Web Design

    CERN Document Server

    Watrall, Ethan

    2008-01-01

    Want to know how to make your pages look beautiful, communicate your message effectively, guide visitors through your website with ease, and get everything approved by the accessibility and usability police at the same time? Head First Web Design is your ticket to mastering all of these complex topics, and understanding what's really going on in the world of web design. Whether you're building a personal blog or a corporate website, there's a lot more to web design than div's and CSS selectors, but what do you really need to know? With this book, you'll learn the secrets of designing effecti

  20. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  1. The development and application of bioinformatics core competencies to improve bioinformatics training and education.

    Science.gov (United States)

    Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie

    2018-02-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.

  2. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  3. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  4. Web archives

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    This article deals with general web archives and the principles for selection of materials to be preserved. It opens with a brief overview of reasons why general web archives are needed. Section two and three present major, long termed web archive initiatives and discuss the purposes and possible...... values of web archives and asks how to meet unknown future needs, demands and concerns. Section four analyses three main principles in contemporary web archiving strategies, topic centric, domain centric and time-centric archiving strategies and section five discuss how to combine these to provide...... a broad and rich archive. Section six is concerned with inherent limitations and why web archives are always flawed. The last sections deal with the question how web archives may fit into the rapidly expanding, but fragmented landscape of digital repositories taking care of various parts...

  5. Biological Web Service Repositories Review.

    Science.gov (United States)

    Urdidiales-Nieto, David; Navas-Delgado, Ismael; Aldana-Montes, José F

    2017-05-01

    Web services play a key role in bioinformatics enabling the integration of database access and analysis of algorithms. However, Web service repositories do not usually publish information on the changes made to their registered Web services. Dynamism is directly related to the changes in the repositories (services registered or unregistered) and at service level (annotation changes). Thus, users, software clients or workflow based approaches lack enough relevant information to decide when they should review or re-execute a Web service or workflow to get updated or improved results. The dynamism of the repository could be a measure for workflow developers to re-check service availability and annotation changes in the services of interest to them. This paper presents a review on the most well-known Web service repositories in the life sciences including an analysis of their dynamism. Freshness is introduced in this paper, and has been used as the measure for the dynamism of these repositories. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  6. Bioinformatics research in the Asia Pacific: a 2007 update.

    Science.gov (United States)

    Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee

    2008-01-01

    We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 at Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides a scientific meeting at Hong Kong, satellite events organized are a pre-conference training workshop at Hanoi, Vietnam and a post-conference workshop at Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region, to global bioinformatics endeavours.

  7. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  8. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  9. SWS: accessing SRS sites contents through Web Services

    OpenAIRE

    Romano, Paolo; Marra, Domenico

    2008-01-01

    Background Web Services and Workflow Management Systems can support creation and deployment of network systems, able to automate data analysis and retrieval processes in biomedical research. Web Services have been implemented at bioinformatics centres and workflow systems have been proposed for biological data analysis. New databanks are often developed by taking into account these technologies, but many existing databases do not allow a programmatic access. Only a fraction of available datab...

  10. Bioinformatics in cancer therapy and drug design

    International Nuclear Information System (INIS)

    Horbach, D.Y.; Usanov, S.A.

    2005-01-01

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  11. Bioinformatics in cancer therapy and drug design

    Energy Technology Data Exchange (ETDEWEB)

    Horbach, D Y [International A. Sakharov environmental univ., Minsk (Belarus); Usanov, S A [Inst. of bioorganic chemistry, National academy of sciences of Belarus, Minsk (Belarus)

    2005-05-15

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  12. Bioinformatics study of the mangrove actin genes

    Science.gov (United States)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes the bioinformatics methods to analyze eight actin genes from mangrove plants on DDBJ/EMBL/GenBank as well as predicted the structure, composition, subcellular localization, similarity, and phylogenetic. The physical and chemical properties of eight mangroves showed variation among the genes. The percentage of the secondary structure of eight mangrove actin genes followed the order of a helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast to this observation, the remaining actin genes were random coil > extended chain structure > a helix. This study, therefore, shown the prediction of secondary structure was performed for necessary structural information. The values of chloroplast or signal peptide or mitochondrial target were too small, indicated that no chloroplast or mitochondrial transit peptide or signal peptide of secretion pathway in mangrove actin genes. These results suggested the importance of understanding the diversity and functional of properties of the different amino acids in mangrove actin genes. To clarify the relationship among the mangrove actin gene, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed, the first group contains B. gymnorrhiza BgAct and R. stylosa RsActl. The second cluster which consists of 5 actin genes the largest group, and the last branch consist of one gene, B. sexagula Act. The present study, therefore, supported the previous results that plant actin genes form distinct clusters in the tree.

  13. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisci......The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry...... into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples....

  14. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  15. Cytoscape tools for the web age: D3.js and Cytoscape.js exporters [v2; ref status: indexed, http://f1000r.es/4k7

    Directory of Open Access Journals (Sweden)

    Keiichiro Ono

    2014-10-01

    Full Text Available In this paper we present new data export modules for Cytoscape 3 that can generate network files for Cytoscape.js and D3.js. Cytoscape.js exporter is implemented as a core feature of Cytoscape 3, and D3.js exporter is available as a Cytoscape 3 app. These modules enable users to seamlessly export network and table data sets generated in Cytoscape to popular JavaScript library readable formats. In addition, we implemented template web applications for browser-based interactive network visualization that can be used as basis for complex data visualization applications for bioinformatics research. Example web applications created with these tools demonstrate how Cytoscape works in modern data visualization workflows built with traditional desktop tools and emerging web-based technologies. This interactivity enables researchers more flexibility than with static images, thereby greatly improving the quality of insights researchers can gain from them.

  16. Graphics processing units in bioinformatics, computational biology and systems biology.

    Science.gov (United States)

    Nobile, Marco S; Cazzaniga, Paolo; Tangherloni, Andrea; Besozzi, Daniela

    2017-09-01

    Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining an increasing attention by the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools. © The Author 2016. Published by Oxford University Press.

  17. Web Engineering

    Energy Technology Data Exchange (ETDEWEB)

    White, Bebo

    2003-06-23

    Web Engineering is the application of systematic, disciplined and quantifiable approaches to development, operation, and maintenance of Web-based applications. It is both a pro-active approach and a growing collection of theoretical and empirical research in Web application development. This paper gives an overview of Web Engineering by addressing the questions: (a) why is it needed? (b) what is its domain of operation? (c) how does it help and what should it do to improve Web application development? and (d) how should it be incorporated in education and training? The paper discusses the significant differences that exist between Web applications and conventional software, the taxonomy of Web applications, the progress made so far and the research issues and experience of creating a specialization at the master's level. The paper reaches a conclusion that Web Engineering at this stage is a moving target since Web technologies are constantly evolving, making new types of applications possible, which in turn may require innovations in how they are built, deployed and maintained.

  18. A survey on web modeling approaches for ubiquitous web applications

    NARCIS (Netherlands)

    Schwinger, W.; Retschitzegger, W.; Schauerhuber, A.; Kappel, G.; Wimmer, M.; Pröll, B.; Cachero Castro, C.; Casteleyn, S.; De Troyer, O.; Fraternali, P.; Garrigos, I.; Garzotto, F.; Ginige, A.; Houben, G.J.P.M.; Koch, N.; Moreno, N.; Pastor, O.; Paolini, P.; Pelechano Ferragud, V.; Rossi, G.; Schwabe, D.; Tisi, M.; Vallecillo, A.; Sluijs, van der K.A.M.; Zhang, G.

    2008-01-01

    Purpose – Ubiquitous web applications (UWA) are a new type of web applications which are accessed in various contexts, i.e. through different devices, by users with various interests, at anytime from anyplace around the globe. For such full-fledged, complex software systems, a methodologically sound

  19. Development of a cloud-based Bioinformatics Training Platform.

    Science.gov (United States)

    Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A

    2017-05-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.

  20. Virginia Bioinformatics Institute to expand cyberinfrastructure education and outreach project

    OpenAIRE

    Whyte, Barry James

    2008-01-01

    The National Science Foundation has awarded the Virginia Bioinformatics Institute at Virginia Tech $918,000 to expand its education and outreach program in Cyberinfrastructure - Training, Education, Advancement and Mentoring, commonly known as the CI-TEAM.

  1. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal; Salama, Khaled N.; Zidan, Mohammed A.

    2012-01-01

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we

  2. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  3. Bioinformatics Education in Pathology Training: Current Scope and Future Direction

    Directory of Open Access Journals (Sweden)

    Michael R Clay

    2017-04-01

    Full Text Available Training anatomic and clinical pathology residents in the principles of bioinformatics is a challenging endeavor. Most residents receive little to no formal exposure to bioinformatics during medical education, and most of the pathology training is spent interpreting histopathology slides using light microscopy or focused on laboratory regulation, management, and interpretation of discrete laboratory data. At a minimum, residents should be familiar with data structure, data pipelines, data manipulation, and data regulations within clinical laboratories. Fellowship-level training should incorporate advanced principles unique to each subspecialty. Barriers to bioinformatics education include the clinical apprenticeship training model, ill-defined educational milestones, inadequate faculty expertise, and limited exposure during medical training. Online educational resources, case-based learning, and incorporation into molecular genomics education could serve as effective educational strategies. Overall, pathology bioinformatics training can be incorporated into pathology resident curricula, provided there is motivation to incorporate, institutional support, educational resources, and adequate faculty expertise.

  4. In silico cloning and bioinformatic analysis of PEPCK gene in ...

    African Journals Online (AJOL)

    Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. According to the relative conservation of homologous gene, a bioinformatics strategy was applied to clone Fusarium ...

  5. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrö nen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-01-01

    concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource

  6. Bioinformatics tools for development of fast and cost effective simple ...

    African Journals Online (AJOL)

    Bioinformatics tools for development of fast and cost effective simple sequence repeat ... comparative mapping and exploration of functional genetic diversity in the ... Already, a number of computer programs have been implemented that aim at ...

  7. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    Science.gov (United States)

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyber infrastructure Consortium, aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was designed to also develop expertise in large scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  8. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  9. Assessment of Data Reliability of Wireless Sensor Network for Bioinformatics

    Directory of Open Access Journals (Sweden)

    Ting Dong

    2017-09-01

    Full Text Available As a focal point of biotechnology, bioinformatics integrates knowledge from biology, mathematics, physics, chemistry, computer science and information science. It generally deals with genome informatics, protein structure and drug design. However, the data or information thus acquired from the main areas of bioinformatics may not be effective. Some researchers combined bioinformatics with wireless sensor network (WSN into biosensor and other tools, and applied them to such areas as fermentation, environmental monitoring, food engineering, clinical medicine and military. In the combination, the WSN is used to collect data and information. The reliability of the WSN in bioinformatics is the prerequisite to effective utilization of information. It is greatly influenced by factors like quality, benefits, service, timeliness and stability, some of them are qualitative and some are quantitative. Hence, it is necessary to develop a method that can handle both qualitative and quantitative assessment of information. A viable option is the fuzzy linguistic method, especially 2-tuple linguistic model, which has been extensively used to cope with such issues. As a result, this paper introduces 2-tuple linguistic representation to assist experts in giving their opinions on different WSNs in bioinformatics that involve multiple factors. Moreover, the author proposes a novel way to determine attribute weights and uses the method to weigh the relative importance of different influencing factors which can be considered as attributes in the assessment of the WSN in bioinformatics. Finally, an illustrative example is given to provide a reasonable solution for the assessment.

  10. Web 25

    DEFF Research Database (Denmark)

    the reader on an exciting time travel journey to learn more about the prehistory of the hyperlink, the birth of the Web, the spread of the early Web, and the Web’s introduction to the general public in mainstream media. Fur- thermore, case studies of blogs, literature, and traditional media going online...

  11. Effects of genetically modified maize events expressing Cry34Ab1, Cry35Ab1, Cry1F, and CP4 EPSPS proteins on arthropod complex food webs.

    Science.gov (United States)

    Pálinkás, Zoltán; Kiss, József; Zalai, Mihály; Szénási, Ágnes; Dorner, Zita; North, Samuel; Woodward, Guy; Balog, Adalbert

    2017-04-01

    Four genetically modified (GM) maize ( Zea mays L.) hybrids (coleopteran resistant, coleopteran and lepidopteran resistant, lepidopteran resistant and herbicide tolerant, coleopteran and herbicide tolerant) and its non-GM control maize stands were tested to compare the functional diversity of arthropods and to determine whether genetic modifications alter the structure of arthropods food webs. A total number of 399,239 arthropod individuals were used for analyses. The trophic groups' number and the links between them indicated that neither the higher magnitude of Bt toxins (included resistance against insect, and against both insects and glyphosate) nor the extra glyphosate treatment changed the structure of food webs. However, differences in the average trophic links/trophic groups were detected between GM and non-GM food webs for herbivore groups and plants. Also, differences in characteristic path lengths between GM and non-GM food webs for herbivores were observed. Food webs parameterized based on 2-year in-field assessments, and their properties can be considered a useful and simple tool to evaluate the effects of Bt toxins on non-target organisms.

  12. Web Page Recommendation Using Web Mining

    OpenAIRE

    Modraj Bhavsar; Mrs. P. M. Chavan

    2014-01-01

    On World Wide Web various kind of content are generated in huge amount, so to give relevant result to user web recommendation become important part of web application. On web different kind of web recommendation are made available to user every day that includes Image, Video, Audio, query suggestion and web page. In this paper we are aiming at providing framework for web page recommendation. 1) First we describe the basics of web mining, types of web mining. 2) Details of each...

  13. A Web of Controversies

    DEFF Research Database (Denmark)

    Baron, Christian

    2011-01-01

    consequences for the treatment of apparently independent epistemic problems that are subject of investigation in other thought collectives. For the practicing scientist it is necessary to take this complex web of interactions into account in order to be able to navigate in such a situation. So far most studies...

  14. Host-parasite interactions and ecology of the malaria parasite-a bioinformatics approach.

    Science.gov (United States)

    Izak, Dariusz; Klim, Joanna; Kaczanowski, Szymon

    2018-04-25

    Malaria remains one of the highest mortality infectious diseases. Malaria is caused by parasites from the genus Plasmodium. Most deaths are caused by infections involving Plasmodium falciparum, which has a complex life cycle. Malaria parasites are extremely well adapted for interactions with their host and their host's immune system and are able to suppress the human immune system, erase immunological memory and rapidly alter exposed antigens. Owing to this rapid evolution, parasites develop drug resistance and express novel forms of antigenic proteins that are not recognized by the host immune system. There is an emerging need for novel interventions, including novel drugs and vaccines. Designing novel therapies requires knowledge about host-parasite interactions, which is still limited. However, significant progress has recently been achieved in this field through the application of bioinformatics analysis of parasite genome sequences. In this review, we describe the main achievements in 'malarial' bioinformatics and provide examples of successful applications of protein sequence analysis. These examples include the prediction of protein functions based on homology and the prediction of protein surface localization via domain and motif analysis. Additionally, we describe PlasmoDB, a database that stores accumulated experimental data. This tool allows data mining of the stored information and will play an important role in the development of malaria science. Finally, we illustrate the application of bioinformatics in the development of population genetics research on malaria parasites, an approach referred to as reverse ecology.

  15. G-DOC Plus - an integrative bioinformatics platform for precision medicine.

    Science.gov (United States)

    Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha

    2016-04-30

    G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. G-DOC Plus is available

  16. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols.

    Science.gov (United States)

    Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A

    2015-01-01

    Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page. PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams. PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.

  17. The secondary metabolite bioinformatics portal: Computational tools to facilitate synthetic biology of secondary metabolite production

    Directory of Open Access Journals (Sweden)

    Tilmann Weber

    2016-06-01

    Full Text Available Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http://www.secondarymetabolites.org is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field.

  18. BioQueue: a novel pipeline framework to accelerate bioinformatics analysis.

    Science.gov (United States)

    Yao, Li; Wang, Heming; Song, Yuanyuan; Sui, Guangchao

    2017-10-15

    With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users' experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. BioQueue is freely available at https://github.com/liyao001/BioQueue. The extensive documentation can be found at http://bioqueue.readthedocs.io. li_yao@outlook.com or gcsui@nefu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  19. Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting.

    Science.gov (United States)

    Ibrahim, Bashar; Arkhipova, Ksenia; Andeweg, Arno C; Posada-Céspedes, Susana; Enault, François; Gruber, Arthur; Koonin, Eugene V; Kupczok, Anne; Lemey, Philippe; McHardy, Alice C; McMahon, Dino P; Pickett, Brett E; Robertson, David L; Scheuermann, Richard H; Zhernakova, Alexandra; Zwart, Mark P; Schönhuth, Alexander; Dutilh, Bas E; Marz, Manja

    2018-05-14

    The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.

  20. Sensor web

    Science.gov (United States)

    Delin, Kevin A. (Inventor); Jackson, Shannon P. (Inventor)

    2011-01-01

    A Sensor Web formed of a number of different sensor pods. Each of the sensor pods include a clock which is synchronized with a master clock so that all of the sensor pods in the Web have a synchronized clock. The synchronization is carried out by first using a coarse synchronization which takes less power, and subsequently carrying out a fine synchronization to make a fine sync of all the pods on the Web. After the synchronization, the pods ping their neighbors to determine which pods are listening and responded, and then only listen during time slots corresponding to those pods which respond.

  1. Automatic generation of bioinformatics tools for predicting protein-ligand binding sites.

    Science.gov (United States)

    Komiyama, Yusuke; Banno, Masaki; Ueki, Kokoro; Saad, Gul; Shimizu, Kentaro

    2016-03-15

    Predictive tools that model protein-ligand binding on demand are needed to promote ligand research in an innovative drug-design environment. However, it takes considerable time and effort to develop predictive tools that can be applied to individual ligands. An automated production pipeline that can rapidly and efficiently develop user-friendly protein-ligand binding predictive tools would be useful. We developed a system for automatically generating protein-ligand binding predictions. Implementation of this system in a pipeline of Semantic Web technique-based web tools will allow users to specify a ligand and receive the tool within 0.5-1 day. We demonstrated high prediction accuracy for three machine learning algorithms and eight ligands. The source code and web application are freely available for download at http://utprot.net They are implemented in Python and supported on Linux. shimizu@bi.a.u-tokyo.ac.jp Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  2. Integrative Analysis of Complex Cancer Genomics and Clinical Profiles Using the cBioPortal

    Science.gov (United States)

    Gao, Jianjiong; Aksoy, Bülent Arman; Dogrusoz, Ugur; Dresdner, Gideon; Gross, Benjamin; Sumer, S. Onur; Sun, Yichao; Jacobsen, Anders; Sinha, Rileen; Larsson, Erik; Cerami, Ethan; Sander, Chris; Schultz, Nikolaus

    2014-01-01

    The cBioPortal for Cancer Genomics (http://cbioportal.org) provides a Web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. The portal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface combined with customized data storage enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes. The portal provides graphical summaries of gene-level data from multiple platforms, network visualization and analysis, survival analysis, patient-centric queries, and software programmatic access. The intuitive Web interface of the portal makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries. Here, we provide a practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics. PMID:23550210

  3. Web Analytics

    Science.gov (United States)

    EPA’s Web Analytics Program collects, analyzes, and provides reports on traffic, quality assurance, and customer satisfaction metrics for EPA’s website. The program uses a variety of analytics tools, including Google Analytics and CrazyEgg.

  4. Web Service

    Science.gov (United States)

    ... topic data in XML format. Using the Web service, software developers can build applications that utilize MedlinePlus health topic information. The service accepts keyword searches as requests and returns relevant ...

  5. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available Abstract Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  6. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine and generate new hypothesis to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, resulting researchers to access multiple applications and manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in public domain including model organism-specific databases to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with point-and-click interface containing links for support information including user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  7. Unipept web services for metaproteomics analysis.

    Science.gov (United States)

    Mesuere, Bart; Willems, Toon; Van der Jeugt, Felix; Devreese, Bart; Vandamme, Peter; Dawyndt, Peter

    2016-06-01

    Unipept is an open source web application that is designed for metaproteomics analysis with a focus on interactive datavisualization. It is underpinned by a fast index built from UniProtKB and the NCBI taxonomy that enables quick retrieval of all UniProt entries in which a given tryptic peptide occurs. Unipept version 2.4 introduced web services that provide programmatic access to the metaproteomics analysis features. This enables integration of Unipept functionality in custom applications and data processing pipelines. The web services are freely available at http://api.unipept.ugent.be and are open sourced under the MIT license. Unipept@ugent.be Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Moby and Moby 2: creatures of the deep (web).

    Science.gov (United States)

    Vandervalk, Ben P; McCarthy, E Luke; Wilkinson, Mark D

    2009-03-01

    Facile and meaningful integration of data from disparate resources is the 'holy grail' of bioinformatics. Some resources have begun to address this problem by providing their data using Semantic Web standards, specifically the Resource Description Framework (RDF) and the Web Ontology Language (OWL). Unfortunately, adoption of Semantic Web standards has been slow overall, and even in cases where the standards are being utilized, interconnectivity between resources is rare. In response, we have seen the emergence of centralized 'semantic warehouses' that collect public data from third parties, integrate it, translate it into OWL/RDF and provide it to the community as a unified and queryable resource. One limitation of the warehouse approach is that queries are confined to the resources that have been selected for inclusion. A related problem, perhaps of greater concern, is that the majority of bioinformatics data exists in the 'Deep Web'-that is, the data does not exist until an application or analytical tool is invoked, and therefore does not have a predictable Web address. The inability to utilize Uniform Resource Identifiers (URIs) to address this data is a barrier to its accessibility via URI-centric Semantic Web technologies. Here we examine 'The State of the Union' for the adoption of Semantic Web standards in the health care and life sciences domain by key bioinformatics resources, explore the nature and connectivity of several community-driven semantic warehousing projects, and report on our own progress with the CardioSHARE/Moby-2 project, which aims to make the resources of the Deep Web transparently accessible through SPARQL queries.

  9. GBA manager: an online tool for querying low-complexity regions in proteins.

    Science.gov (United States)

    Bandyopadhyay, Nirmalya; Kahveci, Tamer

    2010-01-01

    Abstract We developed GBA Manager, an online software that facilitates the Graph-Based Algorithm (GBA) we proposed in our earlier work. GBA identifies the low-complexity regions (LCR) of protein sequences. GBA exploits a similarity matrix, such as BLOSUM62, to compute the complexity of the subsequences of the input protein sequence. It uses a graph-based algorithm to accurately compute the regions that have low complexities. GBA Manager is a user friendly web-service that enables online querying of protein sequences using GBA. In addition to querying capabilities of the existing GBA algorithm, GBA Manager computes the p-values of the LCR identified. The p-value gives an estimate of the possibility that the region appears by chance. GBA Manager presents the output in three different understandable formats. GBA Manager is freely accessible at http://bioinformatics.cise.ufl.edu/GBA/GBA.htm .

  10. Bioinformatics in the Netherlands: the value of a nationwide community.

    Science.gov (United States)

    van Gelder, Celia W G; Hooft, Rob W W; van Rijswijk, Merlijn N; van den Berg, Linda; Kok, Ruben G; Reinders, Marcel; Mons, Barend; Heringa, Jaap

    2017-09-15

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures supporting a relatively large Dutch bioinformatics community will be reviewed. We will show that the most valuable resource that we have built over these years is the close-knit national expert community that is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing the ever-changing landscape of data challenges and working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times where existing funding models are challenged and change rapidly. © The Author 2017. Published by Oxford University Press.

  11. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  12. PseKRAAC: a flexible web server for generating pseudo K-tuple reduced amino acids composition.

    Science.gov (United States)

    Zuo, Yongchun; Li, Yuan; Chen, Yingli; Li, Guangpeng; Yan, Zhenhe; Yang, Lei

    2017-01-01

    The reduced amino acids perform powerful ability for both simplifying protein complexity and identifying functional conserved regions. However, dealing with different protein problems may need different kinds of cluster methods. Encouraged by the success of pseudo-amino acid composition algorithm, we developed a freely available web server, called PseKRAAC (the pseudo K-tuple reduced amino acids composition). By implementing reduced amino acid alphabets, the protein complexity can be significantly simplified, which leads to decrease chance of overfitting, lower computational handicap and reduce information redundancy. PseKRAAC delivers more capability for protein research by incorporating three crucial parameters that describes protein composition. Users can easily generate many different modes of PseKRAAC tailored to their needs by selecting various reduced amino acids alphabets and other characteristic parameters. It is anticipated that the PseKRAAC web server will become a very useful tool in computational proteomics and protein sequence analysis. Freely available on the web at http://bigdata.imu.edu.cn/psekraac CONTACTS: yczuo@imu.edu.cn or imu.hema@foxmail.com or yanglei_hmu@163.comSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  14. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems; ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques, to solve a variety of biological problems. One of the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics based problems.

  15. Best practices in bioinformatics training for life scientists

    DEFF Research Database (Denmark)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik

    2013-01-01

    their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes...... to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse...

  16. TAPIR, a web server for the prediction of plant microRNA targets, including target mimics.

    Science.gov (United States)

    Bonnet, Eric; He, Ying; Billiau, Kenny; Van de Peer, Yves

    2010-06-15

    We present a new web server called TAPIR, designed for the prediction of plant microRNA targets. The server offers the possibility to search for plant miRNA targets using a fast and a precise algorithm. The precise option is much slower but guarantees to find less perfectly paired miRNA-target duplexes. Furthermore, the precise option allows the prediction of target mimics, which are characterized by a miRNA-target duplex having a large loop, making them undetectable by traditional tools. The TAPIR web server can be accessed at: http://bioinformatics.psb.ugent.be/webtools/tapir. Supplementary data are available at Bioinformatics online.

  17. A bioinformatic survey of RNA-binding proteins in Plasmodium.

    Science.gov (United States)

    Reddy, B P Niranjan; Shrestha, Sony; Hart, Kevin J; Liang, Xiaoying; Kemirembe, Karen; Cui, Liwang; Lindner, Scott E

    2015-11-02

    The malaria parasites in the genus Plasmodium have a very complicated life cycle involving an invertebrate vector and a vertebrate host. RNA-binding proteins (RBPs) are critical factors involved in every aspect of the development of these parasites. However, very few RBPs have been functionally characterized to date in the human parasite Plasmodium falciparum. Using different bioinformatic methods and tools we searched P. falciparum genome to list and annotate RBPs. A representative 3D models for each of the RBD domain identified in P. falciparum was created using I-TESSAR and SWISS-MODEL. Microarray and RNAseq data analysis pertaining PfRBPs was performed using MeV software. Finally, Cytoscape was used to create protein-protein interaction network for CITH-Dozi and Caf1-CCR4-Not complexes. We report the identification of 189 putative RBP genes belonging to 13 different families in Plasmodium, which comprise 3.5% of all annotated genes. Almost 90% (169/189) of these genes belong to six prominent RBP classes, namely RNA recognition motifs, DEAD/H-box RNA helicases, K homology, Zinc finger, Puf and Alba gene families. Interestingly, almost all of the identified RNA-binding helicases and KH genes have cognate homologs in model species, suggesting their evolutionary conservation. Exploration of the existing P. falciparum blood-stage transcriptomes revealed that most RBPs have peak mRNA expression levels early during the intraerythrocytic development cycle, which taper off in later stages. Nearly 27% of RBPs have elevated expression in gametocytes, while 47 and 24% have elevated mRNA expression in ookinete and asexual stages. Comparative interactome analyses using human and Plasmodium protein-protein interaction datasets suggest extensive conservation of the PfCITH/PfDOZI and PfCaf1-CCR4-NOT complexes. The Plasmodium parasites possess a large number of putative RBPs belonging to most of RBP families identified so far, suggesting the presence of extensive post

  18. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  19. 5th HUPO BPP Bioinformatics Meeting at the European Bioinformatics Institute in Hinxton, UK--Setting the analysis frame.

    Science.gov (United States)

    Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf

    2005-09-01

    The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.

  20. LipSpin: A New Bioinformatics Tool for Quantitative 1H NMR Lipid Profiling.

    Science.gov (United States)

    Barrilero, Rubén; Gil, Miriam; Amigó, Núria; Dias, Cintia B; Wood, Lisa G; Garg, Manohar L; Ribalta, Josep; Heras, Mercedes; Vinaixa, Maria; Correig, Xavier

    2018-02-06

    The structural similarity among lipid species and the low sensitivity and spectral resolution of nuclear magnetic resonance (NMR) have traditionally hampered the routine use of 1 H NMR lipid profiling of complex biological samples in metabolomics, which remains mostly manual and lacks freely available bioinformatics tools. However, 1 H NMR lipid profiling provides fast quantitative screening of major lipid classes (fatty acids, glycerolipids, phospholipids, and sterols) and some individual species and has been used in several clinical and nutritional studies, leading to improved risk prediction models. In this Article, we present LipSpin, a free and open-source bioinformatics tool for quantitative 1 H NMR lipid profiling. LipSpin implements a constrained line shape fitting algorithm based on voigt profiles and spectral templates from spectra of lipid standards, which automates the analysis of severely overlapped spectral regions and lipid signals with complex coupling patterns. LipSpin provides the most detailed quantification of fatty acid families and choline phospholipids in serum lipid samples by 1 H NMR to date. Moreover, analytical and clinical results using LipSpin quantifications conform with other techniques commonly used for lipid analysis.

  1. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  2. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA LibrariesG. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  3. Bioinformatics in the Netherlands : The value of a nationwide community

    NARCIS (Netherlands)

    van Gelder, Celia W.G.; Hooft, Rob; van Rijswijk, Merlijn; van den Berg, Linda; Kok, Ruben; Reinders, M.J.T.; Mons, Barend; Heringa, Jaap

    2017-01-01

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures

  4. Bioinformatic tools and guideline for PCR primer design | Abd ...

    African Journals Online (AJOL)

    Bioinformatics has become an essential tool not only for basic research but also for applied research in biotechnology and biomedical sciences. Optimal primer sequence and appropriate primer concentration are essential for maximal specificity and efficiency of PCR. A poorly designed primer can result in little or no ...

  5. CROSSWORK for Glycans: Glycan Identificatin Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope " - CROSSWORKS for glycans:  a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  6. Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum

    Science.gov (United States)

    Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.

    2006-01-01

    We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…

  7. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  8. Rapid cloning and bioinformatic analysis of spinach Y chromosome ...

    Indian Academy of Sciences (India)

    Rapid cloning and bioinformatic analysis of spinach Y chromosome- specific EST sequences. Chuan-Liang Deng, Wei-Li Zhang, Ying Cao, Shao-Jing Wang, ... Arabidopsis thaliana mRNA for mitochondrial half-ABC transporter (STA1 gene). 389 2.31E-13. 98.96. SP3−12. Betula pendula histidine kinase 3 (HK3) mRNA, ...

  9. Staff Scientist - RNA Bioinformatics | Center for Cancer Research

    Science.gov (United States)

    The newly established RNA Biology Laboratory (RBL) at the Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH) in Frederick, Maryland is recruiting a Staff Scientist with strong expertise in RNA bioinformatics to join the Intramural Research Program’s mission of high impact, high reward science. The RBL is the equivalent of an

  10. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes-the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software-the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.

  11. Fiber webs

    Science.gov (United States)

    Roger M. Rowell; James S. Han; Von L. Byrd

    2005-01-01

    Wood fibers can be used to produce a wide variety of low-density three-dimensional webs, mats, and fiber-molded products. Short wood fibers blended with long fibers can be formed into flexible fiber mats, which can be made by physical entanglement, nonwoven needling, or thermoplastic fiber melt matrix technologies. The most common types of flexible mats are carded, air...

  12. Web Sitings.

    Science.gov (United States)

    Lo, Erika

    2001-01-01

    Presents seven mathematics games, located on the World Wide Web, for elementary students, including: Absurd Math: Pre-Algebra from Another Dimension; The Little Animals Activity Centre; MathDork Game Room (classic video games focusing on algebra); Lemonade Stand (students practice math and business skills); Math Cats (teaches the artistic beauty…

  13. Tracheal web

    International Nuclear Information System (INIS)

    Legasto, A.C.; Haller, J.O.; Giusti, R.J.

    2004-01-01

    Congenital tracheal web is a rare entity often misdiagnosed as refractory asthma. Clinical suspicion based on patient history, examination, and pulmonary function tests should lead to its consideration. Bronchoscopy combined with CT imaging and multiplanar reconstruction is an accepted, highly sensitive means of diagnosis. (orig.)

  14. Classification and learning using genetic algorithms applications in Bioinformatics and Web Intelligence

    CERN Document Server

    Bandyopadhyay, Sanghamitra

    2007-01-01

    This book provides a unified framework that describes how genetic learning can be used to design pattern recognition and learning systems. It examines how a search technique, the genetic algorithm, can be used for pattern classification mainly through approximating decision boundaries. Coverage also demonstrates the effectiveness of the genetic classifiers vis-à-vis several widely used classifiers, including neural networks.

  15. Bioinformatics analysis of disordered proteins in prokaryotes

    Directory of Open Access Journals (Sweden)

    Malkov Saša N

    2011-03-01

    -facultative anaerobic or aquatic or mesophilic organisms, etc. Maximum disorder in bacteria is observed for high GC content-high genome size organisms, high genome size-aerobic organisms, etc. Some of the most reliable association rules mined establish relationships between high GC content and high protein disorder, medium GC content and both medium and low protein disorder, anaerobic organisms and medium protein disorder, Gammaproteobacteria and low protein disorder, etc. A web site Prokaryote Disorder Database has been designed and implemented at the address http://bioinfo.matf.bg.ac.rs/disorder, which contains complete results of the analysis of protein disorder performed for 296 prokaryotic completely sequenced genomes. Conclusions Exhaustive disorder analysis has been performed by functional classes of proteins, for a larger dataset of prokaryotic organisms than previously done. Results obtained are well correlated to those previously published, with some extension in the range of disorder level and clear distinction between functional classes of proteins. Wide correlation and association analysis between protein disorder and genomic and ecological characteristics has been performed for the first time. The results obtained give insight into multi-relationships among the characteristics and protein disorder. Such analysis provides for better understanding of the evolutionary process and may be useful for taxon determination. The main drawback of the approach is the fact that the disorder considered has been predicted and not experimentally established.

  16. Bioclipse: an open source workbench for chemo- and bioinformatics

    Directory of Open Access Journals (Sweden)

    Wagener Johannes

    2007-02-01

    Full Text Available Abstract Background There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no sucessful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under Eclipse Public License (EPL, an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at http://www.bioclipse.net.

  17. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  18. Bioinformatics and multiepitope DNA immunization to design rational snake antivenom.

    Directory of Open Access Journals (Sweden)

    Simon C Wagstaff

    2006-06-01

    Full Text Available Snake venom is a potentially lethal and complex mixture of hundreds of functionally diverse proteins that are difficult to purify and hence difficult to characterize. These difficulties have inhibited the development of toxin-targeted therapy, and conventional antivenom is still generated from the sera of horses or sheep immunized with whole venom. Although life-saving, antivenoms contain an immunoglobulin pool of unknown antigen specificity and known redundancy, which necessitates the delivery of large volumes of heterologous immunoglobulin to the envenomed victim, thus increasing the risk of anaphylactoid and serum sickness adverse effects. Here we exploit recent molecular sequence analysis and DNA immunization tools to design more rational toxin-targeted antivenom.We developed a novel bioinformatic strategy that identified sequences encoding immunogenic and structurally significant epitopes from an expressed sequence tag database of a venom gland cDNA library of Echis ocellatus, the most medically important viper in Africa. Focusing upon snake venom metalloproteinases (SVMPs that are responsible for the severe and frequently lethal hemorrhage in envenomed victims, we identified seven epitopes that we predicted would be represented in all isomers of this multimeric toxin and that we engineered into a single synthetic multiepitope DNA immunogen (epitope string. We compared the specificity and toxin-neutralizing efficacy of antiserum raised against the string to antisera raised against a single SVMP toxin (or domains or antiserum raised by conventional (whole venom immunization protocols. The SVMP string antiserum, as predicted in silico, contained antibody specificities to numerous SVMPs in E. ocellatus venom and venoms of several other African vipers. More significantly, the antiserum cross-specifically neutralized hemorrhage induced by E. ocellatus and Cerastes cerastes cerastes venoms.These data provide valuable sequence and structure

  19. XML databases and the semantic web

    CERN Document Server

    Thuraisingham, Bhavani

    2002-01-01

    Efficient access to data, sharing data, extracting information from data, and making use of the information have become urgent needs for today''s corporations. With so much data on the Web, managing it with conventional tools is becoming almost impossible. New tools and techniques are necessary to provide interoperability as well as warehousing between multiple data sources and systems, and to extract information from the databases. XML Databases and the Semantic Web focuses on critical and new Web technologies needed for organizations to carry out transactions on the Web, to understand how to use the Web effectively, and to exchange complex documents on the Web.This reference for database administrators, database designers, and Web designers working in tandem with database technologists covers three emerging technologies of significant impact for electronic business: Extensible Markup Language (XML), semi-structured databases, and the semantic Web. The first two parts of the book explore these emerging techn...

  20. Development and evaluation of a bioinformatics approach for designing molecular assays for viral detection.

    Directory of Open Access Journals (Sweden)

    Pierre H H Schneeberger

    Full Text Available Viruses belonging to the Flaviviridae and Bunyaviridae families show considerable genetic diversity. However, this diversity is not necessarily taken into account when developing diagnostic assays, which are often based on the pairwise alignment of a limited number of sequences. Our objective was to develop and evaluate a bioinformatics workflow addressing two recurrent issues of molecular assay design: (i the high intraspecies genetic diversity in viruses and (ii the potential for cross-reactivity with close relatives.The workflow developed herein was based on two consecutive BLASTn steps; the first was utilized to select highly conserved regions among the viral taxon of interest, and the second was employed to assess the degree of similarity of these highly-conserved regions to close relatives. Subsequently, the workflow was tested on a set of eight viral species, including various strains from the Flaviviridae and Bunyaviridae families.The genetic diversity ranges from as low as 0.45% variable sites over the complete genome of the Japanese encephalitis virus to more than 16% of variable sites on segment L of the Crimean-Congo hemorrhagic fever virus. Our proposed bioinformatics workflow allowed the selection-based on computing scores-of the best target for a diagnostic molecular assay for the eight viral species investigated.Our bioinformatics workflow allowed rapid selection of highly conserved and specific genomic fragments among the investigated viruses, while considering up to several hundred complete genomic sequences. The pertinence of this workflow will increase in parallel to the number of sequences made publicly available. We hypothesize that our workflow might be utilized to select diagnostic molecular markers for higher organisms with more complex genomes, provided the sequences are made available.

  1. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows.

    Science.gov (United States)

    Sztromwasser, Pawel; Puntervoll, Pål; Petersen, Kjell

    2011-07-26

    Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently however a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade the Web Services have spread wide in bioinformatics, and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services, that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in consequence allows to execute in-silico experiments on genome-scale, using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.

  2. Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows

    Directory of Open Access Journals (Sweden)

    Sztromwasser Paweł

    2011-06-01

    Full Text Available Biological databases and computational biology tools are provided by research groups around the world, and made accessible on the Web. Combining these resources is a common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has been commonly addressed in a pragmatic way, by tedious and error-prone scripting. Recently however a more reliable technique has been identified and proposed as the platform that would tie together bioinformatics resources, namely Web Services. In the last decade the Web Services have spread wide in bioinformatics, and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services, that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data-partitioning lowers resource demands of services and increases their throughput, which in consequence allows to execute in-silico experiments on genome-scale, using standard SOAP Web Services and workflows. As a proof-of-principle we annotated an RNA-seq dataset using a plain BPEL workflow engine.

  3. Web components and the semantic web

    OpenAIRE

    Casey, Maire; Pahl, Claus

    2003-01-01

    Component-based software engineering on the Web differs from traditional component and software engineering. We investigate Web component engineering activites that are crucial for the development,com position, and deployment of components on the Web. The current Web Services and Semantic Web initiatives strongly influence our work. Focussing on Web component composition we develop description and reasoning techniques that support a component developer in the composition activities,fo cussing...

  4. Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes.

    Science.gov (United States)

    Margaria, Tiziana; Kubczak, Christian; Steffen, Bernhard

    2008-04-25

    With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout. Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution. As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way.

  5. A development process meta-model for Web based expert systems: The Web engineering point of view

    DEFF Research Database (Denmark)

    Dokas, I.M.; Alapetite, Alexandre

    2006-01-01

    raised their complexity. Unfortunately, there is so far no clear answer to the question: How may the methods and experience of Web engineering and expert systems be combined and applied in order todevelop effective and successful Web based expert systems? In an attempt to answer this question...... on Web based expert systems – will be presented. The idea behind the presentation of theaccessibility evaluation and its conclusions is to show to Web based expert system developers, who typically have little Web engineering background, that Web engineering issues must be considered when developing Web......Similar to many legacy computer systems, expert systems can be accessed via the Web, forming a set of Web applications known as Web based expert systems. The tough Web competition, the way people and organizations rely on Web applications and theincreasing user requirements for better services have...

  6. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  7. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  8. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  9. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    Science.gov (United States)

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been carried out in recent years thanks to Metagenomics, which allows to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. The accessibility to proper computational systems for the large-scale bioinformatic analysis of HTS data represents, currently, one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities by a completely automated workflow, comprehensive of all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are absolutely required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented into a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent

  10. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  11. The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data.

    Science.gov (United States)

    Ambrosini, Giovanna; Dreos, René; Kumar, Sunil; Bucher, Philipp

    2016-11-18

    ChIP-seq and related high-throughput chromatin profilig assays generate ever increasing volumes of highly valuable biological data. To make sense out of it, biologists need versatile, efficient and user-friendly tools for access, visualization and itegrative analysis of such data. Here we present the ChIP-Seq command line tools and web server, implementing basic algorithms for ChIP-seq data analysis starting with a read alignment file. The tools are optimized for memory-efficiency and speed thus allowing for processing of large data volumes on inexpensive hardware. The web interface provides access to a large database of public data. The ChIP-Seq tools have a modular and interoperable design in that the output from one application can serve as input to another one. Complex and innovative tasks can thus be achieved by running several tools in a cascade. The various ChIP-Seq command line tools and web services either complement or compare favorably to related bioinformatics resources in terms of computational efficiency, ease of access to public data and interoperability with other web-based tools. The ChIP-Seq server is accessible at http://ccg.vital-it.ch/chipseq/ .

  12. OralCard: a bioinformatic tool for the study of oral proteome.

    Science.gov (United States)

    Arrais, Joel P; Rosa, Nuno; Melo, José; Coelho, Edgar D; Amaral, Diana; Correia, Maria José; Barros, Marlene; Oliveira, José Luís

    2013-07-01

    The molecular complexity of the human oral cavity can only be clarified through identification of components that participate within it. However current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome both of human (OralOme) and microbial origin (MicroOralOme) structured in protein, disease and organism. This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with diabetes melitus type 2. OralCard is available at http://bioinformatics.ua.pt/oralcard. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Meeting review: 2002 O'Reilly Bioinformatics Technology Conference.

    Science.gov (United States)

    Counsell, Damian

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O'Reilly Bioinformatics Technology Conference. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences.Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O'Reilly himself, Tim O'Reilly. There were presentations, tutorials, debates, quizzes and even a 'jam session' for musical bioinformaticists.

  14. Rise and demise of bioinformatics? Promise and progress.

    Directory of Open Access Journals (Sweden)

    Christos A Ouzounis

    Full Text Available The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

  15. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Shan Li

    2014-01-01

    Full Text Available With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  16. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data.

  17. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  18. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.  .

  19. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  20. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  1. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA of the Object Management Group (OMG, we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  2. Intelligent bioinformatics : the application of artificial intelligence techniques to bioinformatics problems

    National Research Council Canada - National Science Library

    Keedwell, Edward

    2005-01-01

    ... Intelligence and Computer Science 3.1 Introduction to search 3.2 Search algorithms 3.3 Heuristic search methods 3.4 Optimal search strategies 3.5 Problems with search techniques 3.6 Complexity of...

  3. Neonatal Informatics: Transforming Neonatal Care Through Translational Bioinformatics

    Science.gov (United States)

    Palma, Jonathan P.; Benitz, William E.; Tarczy-Hornoch, Peter; Butte, Atul J.; Longhurst, Christopher A.

    2012-01-01

    The future of neonatal informatics will be driven by the availability of increasingly vast amounts of clinical and genetic data. The field of translational bioinformatics is concerned with linking and learning from these data and applying new findings to clinical care to transform the data into proactive, predictive, preventive, and participatory health. As a result of advances in translational informatics, the care of neonates will become more data driven, evidence based, and personalized. PMID:22924023

  4. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Full Text Available Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI, and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  5. A Quick Guide for Building a Successful Bioinformatics Community

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  6. Kubernetes as an approach for solving bioinformatic problems.

    OpenAIRE

    Markstedt, Olof

    2017-01-01

    The cluster orchestration tool Kubernetes enables easy deployment and reproducibility of life science research by utilizing the advantages of the container technology. The container technology allows for easy tool creation, sharing and runs on any Linux system once it has been built. The applicability of Kubernetes as an approach to run bioinformatic workflows was evaluated and resulted in some examples of how Kubernetes and containers could be used within the field of life science and how th...

  7. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    Science.gov (United States)

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  8. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillussphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis toward genome sequence of Lysinibacillussphaericus (L. sphaericus) to determine gene encoded for gelatinase. L. sphaericus was isolated from soil and gelatinase species-specific bacterium to porcine and bovine gelatin. This bacterium offers the possibility of enzymes production which is specific to both species of meat, respectively. The main focus of this research is to identify the gelatinase encoded gene within the bacteria of L. Sphaericus using bioinformatics analysis of partially sequence genome. From the research study, three candidate gene were identified which was, gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21 which containing 1563 base pair (bp) in size with 520 amino acids sequence; Secondly, gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17 which containing 1776 bp in size with 591 amino acids sequence; and Thirdly, gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8 containing 1701 bp in size with 566 amino acids sequence. Three pairs of oligonucleotide primers were designed and namely as, F1, R1, F2, R2, F3 and R3 were targeted short sequences of cDNA by PCR. The amplicons were reliably results in 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Therefore, the results of bioinformatics analysis of L. Sphaericus resulting in gene encoded gelatinase were identified.

  9. Combining medical informatics and bioinformatics toward tools for personalized medicine.

    Science.gov (United States)

    Sarachan, B D; Simmons, M K; Subramanian, P; Temkin, J M

    2003-01-01

    Key bioinformatics and medical informatics research areas need to be identified to advance knowledge and understanding of disease risk factors and molecular disease pathology in the 21 st century toward new diagnoses, prognoses, and treatments. Three high-impact informatics areas are identified: predictive medicine (to identify significant correlations within clinical data using statistical and artificial intelligence methods), along with pathway informatics and cellular simulations (that combine biological knowledge with advanced informatics to elucidate molecular disease pathology). Initial predictive models have been developed for a pilot study in Huntington's disease. An initial bioinformatics platform has been developed for the reconstruction and analysis of pathways, and work has begun on pathway simulation. A bioinformatics research program has been established at GE Global Research Center as an important technology toward next generation medical diagnostics. We anticipate that 21 st century medical research will be a combination of informatics tools with traditional biology wet lab research, and that this will translate to increased use of informatics techniques in the clinic.

  10. A comparison of common programming languages used in bioinformatics.

    Science.gov (United States)

    Fourment, Mathieu; Gillings, Michael R

    2008-02-05

    The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.

  11. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  12. Bioinformatics process management: information flow via a computational journal

    Directory of Open Access Journals (Sweden)

    Lushington Gerald

    2007-12-01

    Full Text Available Abstract This paper presents the Bioinformatics Computational Journal (BCJ, a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples.

  13. haploR: an R package for querying web-based annotation tools.

    Science.gov (United States)

    Zhbannikov, Ilya Y; Arbeev, Konstantin; Ukraintseva, Svetlana; Yashin, Anatoliy I

    2017-01-01

    We developed haploR , an R package for querying web based genome annotation tools HaploReg and RegulomeDB. haploR gathers information in a data frame which is suitable for downstream bioinformatic analyses. This will facilitate post-genome wide association studies streamline analysis for rapid discovery and interpretation of genetic associations.

  14. Instant web scraping with Java

    CERN Document Server

    Mitchell, Ryan

    2013-01-01

    This book is full of short, concise recipes to learn a variety of useful web scraping techniques using Java. You will start with a simple basic recipe of setting up your Java environment and gradually learn some more advanced recipes such as using complex Scrapers.Instant Web Scraping with Java is aimed at developers who, while not necessarily familiar with Java, are at least ready to dive into the complexities of this language with simple, step-by-step instructions leading the way. It is assumed that you have at least an intermediate knowledge of HTML, some knowledge of MySQL, and access to a

  15. Usare WebDewey

    OpenAIRE

    Baldi, Paolo

    2016-01-01

    This presentation shows how to use the WebDewey tool. Features of WebDewey. Italian WebDewey compared with American WebDewey. Querying Italian WebDewey. Italian WebDewey and MARC21. Italian WebDewey and UNIMARC. Numbers, captions, "equivalente verbale": Dewey decimal classification in Italian catalogues. Italian WebDewey and Nuovo soggettario. Italian WebDewey and LCSH. Italian WebDewey compared with printed version of Italian Dewey Classification (22. edition): advantages and disadvantages o...

  16. Semantic Web

    Directory of Open Access Journals (Sweden)

    Anna Lamandini

    2011-06-01

    Full Text Available The semantic Web is a technology at the service of knowledge which is aimed at accessibility and the sharing of content; facilitating interoperability between different systems and as such is one of the nine key technological pillars of TIC (technologies for information and communication within the third theme, programme specific cooperation of the seventh programme framework for research and development (7°PQRS, 2007-2013. As a system it seeks to overcome overload or excess of irrelevant information in Internet, in order to facilitate specific or pertinent research. It is an extension of the existing Web in which the aim is for cooperation between and the computer and people (the dream of Sir Tim Berners –Lee where machines can give more support to people when integrating and elaborating data in order to obtain inferences and a global sharing of data. It is a technology that is able to favour the development of a “data web” in other words the creation of a space in both sets of interconnected and shared data (Linked Data which allows users to link different types of data coming from different sources. It is a technology that will have great effect on everyday life since it will permit the planning of “intelligent applications” in various sectors such as education and training, research, the business world, public information, tourism, health, and e-government. It is an innovative technology that activates a social transformation (socio-semantic Web on a world level since it redefines the cognitive universe of users and enables the sharing not only of information but of significance (collective and connected intelligence.

  17. Intuitive web-based experimental design for high-throughput biomedical data.

    Science.gov (United States)

    Friedrich, Andreas; Kenar, Erhan; Kohlbacher, Oliver; Nahnsen, Sven

    2015-01-01

    Big data bioinformatics aims at drawing biological conclusions from huge and complex biological datasets. Added value from the analysis of big data, however, is only possible if the data is accompanied by accurate metadata annotation. Particularly in high-throughput experiments intelligent approaches are needed to keep track of the experimental design, including the conditions that are studied as well as information that might be interesting for failure analysis or further experiments in the future. In addition to the management of this information, means for an integrated design and interfaces for structured data annotation are urgently needed by researchers. Here, we propose a factor-based experimental design approach that enables scientists to easily create large-scale experiments with the help of a web-based system. We present a novel implementation of a web-based interface allowing the collection of arbitrary metadata. To exchange and edit information we provide a spreadsheet-based, humanly readable format. Subsequently, sample sheets with identifiers and metainformation for data generation facilities can be created. Data files created after measurement of the samples can be uploaded to a datastore, where they are automatically linked to the previously created experimental design model.

  18. An internet-based bioinformatics toolkit for plant biosecurity diagnosis and surveillance of viruses and viroids.

    Science.gov (United States)

    Barrero, Roberto A; Napier, Kathryn R; Cunnington, James; Liefting, Lia; Keenan, Sandi; Frampton, Rebekah A; Szabo, Tamas; Bulman, Simon; Hunter, Adam; Ward, Lisa; Whattam, Mark; Bellgard, Matthew I

    2017-01-11

    Detection and preventing entry of exotic viruses and viroids at the border is critical for protecting plant industries trade worldwide. Existing post entry quarantine screening protocols rely on time-consuming biological indicators and/or molecular assays that require knowledge of infecting viral pathogens. Plants have developed the ability to recognise and respond to viral infections through Dicer-like enzymes that cleave viral sequences into specific small RNA products. Many studies reported the use of a broad range of small RNAs encompassing the product sizes of several Dicer enzymes involved in distinct biological pathways. Here we optimise the assembly of viral sequences by using specific small RNA subsets. We sequenced the small RNA fractions of 21 plants held at quarantine glasshouse facilities in Australia and New Zealand. Benchmarking of several de novo assembler tools yielded SPAdes using a kmer of 19 to produce the best assembly outcomes. We also found that de novo assembly using 21-25 nt small RNAs can result in chimeric assemblies of viral sequences and plant host sequences. Such non-specific assemblies can be resolved by using 21-22 nt or 24 nt small RNAs subsets. Among the 21 selected samples, we identified contigs with sequence similarity to 18 viruses and 3 viroids in 13 samples. Most of the viruses were assembled using only 21-22 nt long virus-derived siRNAs (viRNAs), except for one Citrus endogenous pararetrovirus that was more efficiently assembled using 24 nt long viRNAs. All three viroids found in this study were fully assembled using either 21-22 nt or 24 nt viRNAs. Optimised analysis workflows were customised within the Yabi web-based analytical environment. We present a fully automated viral surveillance and diagnosis web-based bioinformatics toolkit that provides a flexible, user-friendly, robust and scalable interface for the discovery and diagnosis of viral pathogens. We have implemented an automated viral surveillance and

  19. Le web social – La complexité au service de l'apprentissage informel de l'anglais The social web: complexity in the service of the informal learning of English

    Directory of Open Access Journals (Sweden)

    Geoffrey Sockett

    2012-06-01

    Full Text Available Cet article présente les résultats d'un projet de recherche dans lequel de futurs enseignants de langues sont invités à écrire un blogue au sujet de leurs activités informelles de réseautage social en anglais. Les participants, qui ont suivi une formation dans des domaines tels que les théories de l'acquisition des langues, s'expriment en particulier sur l'impact de ces activités sur leur apprentissage de la langue cible. Le corpus ainsi obtenu est analysé et des résultats sont présentés. Le cadre théorique de cette étude est celui des systèmes complexes et dynamiques. Nous soulignons la pertinence de ce modèle pour décrire l'apprentissage informel de l'anglais en ligne. Les résultats de l'étude font apparaître que le très grand nombre d'interactions entre personnes, médias, langues et interfaces de réseautage social conduit à un développement langagier non linéaire et à des transitions entre différentes phases dans des domaines tels que l'aisance et l'acquisition du vocabulaire. Nous présentons des applications pratiques de cette recherche pour l'enseignement des langues, notamment dans la valorisation des pratiques informelles, et suggérons de futures orientations pour la recherche dans ce domaine.This article presents the results of a research project using blogs written by trainee language teachers to report on their informal online social networking activities in English. The respondents, who are trained in areas such as language acquisition theories, are invited to discuss how they might be learning English through these activities. The resulting corpus of blogs is analysed and the impact of these activities on learning of the target language is discussed. The theoretical framework for this study is dynamic systems theory, and it is argued that this model is particularly pertinent for the description of language learning in an informal online context. The results of the study suggest that the very large

  20. Public health and Web 2.0.

    Science.gov (United States)

    Hardey, Michael

    2008-07-01

    This article examines the nature and role of Web 2.0 resources and their impact on health information made available though the Internet. The transition of the Web from version one to Web 2.0 is described and the main features of the new Web examined. Two characteristic Web 2.0 resources are explored and the implications for the public and practitioners examined. First, what are known as 'user reviews' or 'user testimonials', which allow people to comment on the health services delivered to them, are described. Second, new mapping applications that take advantage of the interactive potential of Web 2.0 and provide tools to visualize complex data are examined. Following a discussion of the potential of Web 2.0, it is concluded that it offers considerable opportunities for disseminating health information and creating new sources of data, as well as generating new questions and dilemmas.

  1. Responsive web design workflow

    OpenAIRE

    LAAK, TIMO

    2013-01-01

    Responsive Web Design Workflow is a literature review about Responsive Web Design, a web standards based modern web design paradigm. The goals of this research were to define what responsive web design is, determine its importance in building modern websites and describe a workflow for responsive web design projects. Responsive web design is a paradigm to create adaptive websites, which respond to the properties of the media that is used to render them. The three key elements of responsi...

  2. Assessing computational genomics skills: Our experience in the H3ABioNet African bioinformatics network.

    Directory of Open Access Journals (Sweden)

    C Victor Jongeneel

    2017-06-01

    Full Text Available The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.

  3. 8th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Santana, Juan

    2014-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  4. 11th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Mohamad, Mohd; Rocha, Miguel; Paz, Juan; Pinto, Tiago

    2017-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next-generation sequencing technologies, together with novel and constantly evolving, distinct types of omics data technologies, have created an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information and requires tools from the computational sciences. In the last few years, we have seen the rise of a new generation of interdisciplinary scientists with a strong background in the biological and computational sciences. In this context, the interaction of r...

  5. Assessing computational genomics skills: Our experience in the H3ABioNet African bioinformatics network.

    Science.gov (United States)

    Jongeneel, C Victor; Achinike-Oduaran, Ovokeraye; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Akanle, Bola; Aron, Shaun; Ashano, Efejiro; Bendou, Hocine; Botha, Gerrit; Chimusa, Emile; Choudhury, Ananyo; Donthu, Ravikiran; Drnevich, Jenny; Falola, Oluwadamila; Fields, Christopher J; Hazelhurst, Scott; Hendry, Liesl; Isewon, Itunuoluwa; Khetani, Radhika S; Kumuthini, Judit; Kimuda, Magambo Phillip; Magosi, Lerato; Mainzer, Liudmila Sergeevna; Maslamoney, Suresh; Mbiyavanga, Mamana; Meintjes, Ayton; Mugutso, Danny; Mpangase, Phelelani; Munthali, Richard; Nembaware, Victoria; Ndhlovu, Andrew; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Panji, Sumir; Pillay, Venesa; Rendon, Gloria; Sengupta, Dhriti; Mulder, Nicola

    2017-06-01

    The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa) program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.

  6. 10th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Mayo, Francisco; Paz, Juan

    2016-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  7. Proposition and Organization of an Adaptive Learning Domain Based on Fusion from the Web

    Science.gov (United States)

    Chaoui, Mohammed; Laskri, Mohamed Tayeb

    2013-01-01

    The Web allows self-navigated education through interaction with large amounts of Web resources. While enjoying the flexibility of Web tools, authors may suffer from research and filtering Web resources, when they face various resources formats and complex structures. An adaptation of extracted Web resources must be assured by authors, to give…

  8. Head First Mobile Web

    CERN Document Server

    Gardner, Lyza; Grigsby, Jason

    2011-01-01

    Despite the huge number of mobile devices and apps in use today, your business still needs a website. You just need it to be mobile. Head First Mobile Web walks you through the process of making a conventional website work on a variety smartphones and tablets. Put your JavaScript, CSS media query, and HTML5 skills to work-then optimize your site to perform its best in the demanding mobile market. Along the way, you'll discover how to adapt your business strategy to target specific devices. Navigate the increasingly complex mobile landscapeTake both technical and strategic approaches to mobile

  9. Developing Web Applications

    CERN Document Server

    Moseley, Ralph

    2007-01-01

    Building applications for the Internet is a complex and fast-moving field which utilizes a variety of continually evolving technologies. Whether your perspective is from the client or server side, there are many languages to master - X(HTML), JavaScript, PHP, XML and CSS to name but a few. These languages have to work together cleanly, logically and in harmony with the systems they run on, and be compatible with any browsers with which they interact. Developing Web Applications presents script writing and good programming practice but also allows students to see how the individual technologi

  10. Secure Web Developers Needed!

    CERN Multimedia

    Computer Security Team

    2012-01-01

    You’re about to launch a new website? Cool!! With today’s web programming languages like PHP, Java, Python or Perl, complex websites can be created, easily fulfilling all your use cases. But hold on. Did you ever think about how easily this can be abused? Attackers today are already using automatic tools which can quickly and easily find and exploit vulnerable web applications.   Web applications often suffer from security vulnerabilities, i.e. design flaws or programming bugs that remained undetected during the whole software development cycle. In production these vulnerabilities become security holes, providing an opportunity for exploitation, and can pose immense security risks (and there is no reason to believe that CERN is immune to this). The costs associated with eliminating these bugs could be loosely described by the "1:10:100 rule", i.e. the relative costs for fixing are 1:10:100 for fixing them in the programming:testing:production phases. Thus, the...

  11. Developing library bioinformatics services in context: the Purdue University Libraries bioinformationist program.

    Science.gov (United States)

    Rein, Diane C

    2006-07-01

    Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making.

  12. Research Proposal for Distributed Deep Web Search

    NARCIS (Netherlands)

    Tjin-Kam-Jet, Kien

    2010-01-01

    This proposal identifies two main problems related to deep web search, and proposes a step by step solution for each of them. The first problem is about searching deep web content by means of a simple free-text interface (with just one input field, instead of a complex interface with many input

  13. Automatic invariant detection in dynamic web applications

    NARCIS (Netherlands)

    Groeneveld, F.; Mesbah, A.; Van Deursen, A.

    2010-01-01

    The complexity of modern web applications increases as client-side JavaScript and dynamic DOM programming are used to offer a more interactive web experience. In this paper, we focus on improving the dependability of such applications by automatically inferring invariants from the client-side and

  14. Googling DNA sequences on the World Wide Web.

    Science.gov (United States)

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment independent character based algorithm based on dividing a sequence library (DNA barcodes) and query sequence to words. The actual search is conducted by conventional search tools such as freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.

  15. MLH1 Promoter Methylation and Prediction/Prognosis of Gastric Cancer: A Systematic Review and Meta and Bioinformatic Analysis.

    Science.gov (United States)

    Shen, Shixuan; Chen, Xiaohui; Li, Hao; Sun, Liping; Yuan, Yuan

    2018-01-01

    Background: The promoter methylation of MLH1 gene and gastric cancer (GC)has been investigated previously. To get a more credible conclusion, we performed a systematic review and meta and bioinformatic analysis to clarify the role of MLH1 methylation in the prediction and prognosis of GC. Methods: Eligible studies were targeted after searching the PubMed, Web of Science, Embase, BIOSIS, CNKI and Wanfang Data to collect the information of MLH1 methylation and GC. The link strength between the two was estimated by odds ratio with its 95% confidence interval. The Newcastle-Ottawa scale was used for quantity assessment . Subgroup and sensitivity analysis were conducted to explore sources of heterogeneity. The Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) were employed for bioinformatics analysis on the correlation between MLH1 methylation and GC risk, clinicopathological behavior as well as prognosis. Results: 2365 GC and 1563 controls were included in the meta-analysis. The pooled OR of MLH1 methylation in GC was 4.895 (95% CI: 3.149-7.611, PMLH1 methylation enhanced GC risk but might not related with GC clinicopathological features and prognosis. Conclusion: MLH1 methylation is an alive biomarker for the prediction of GC and it might not affect GC behavior. Further study could be conducted to verify the impact of MLH1 methylation on GC prognosis.

  16. Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

    Science.gov (United States)

    Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

    2015-06-01

    Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  17. [Pharmacogenetics II. Research molecular methods, bioinformatics and ethical concerns].

    Science.gov (United States)

    Daudén, E

    2007-01-01

    Pharmacogenetics refers to the study of the individual pharmacological response based on the genotype. Its objective is to optimize treatment in an individual basis, thereby creating a more efficient and safe personalized therapy. In the second part of this review, the molecular methods of study in pharmacogenetics, including microarray technology or DNA chips, are discussed. Among them we highlight the microarrays used to determine the gene expression that detect specific RNA sequences, and the microarrays employed to determine the genotype that detect specific DNA sequences, including polymorphisms, particularly single nucleotide polymorphisms (SNPs). The relationship between pharmacogenetics, bioinformatics and ethical concerns is reviewed.

  18. MicroRNA from tuberculosis RNA: A bioinformatics study

    OpenAIRE

    Wiwanitkit, Somsri; Wiwanitkit, Viroj

    2012-01-01

    The role of microRNA in the pathogenesis of pulmonary tuberculosis is the interesting topic in chest medicine at present. Recently, it was proposed that the microRNA can be a useful biomarker for monitoring of pulmonary tuberculosis and might be the important part in pathogenesis of disease. Here, the authors perform a bioinformatics study to assess the microRNA within known tuberculosis RNA. The microRNA part can be detected and this can be important key information in further study of the p...

  19. Application of bioinformatics on the detection of pathogens by Pcr

    International Nuclear Information System (INIS)

    Rezig, Slim; Sakhri, Saber

    2007-01-01

    Salmonellas are the main responsible agent for the frequent food-borne gastrointestinal diseases. Their detection using classical methods are laborious and their results take a lot of time to be revealed. In this context, we tried to set up a revealing technique of the invA virulence gene, found in the majority of Salmonella species. After amplification with PCR using specific primers created and verified by bioinformatics programs, two couples of primers were set up and they appeared to be very specific and sensitive for the detection of invA gene. (Author)

  20. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  1. FungiDB: An Integrated Bioinformatic Resource for Fungi and Oomycetes

    Directory of Open Access Journals (Sweden)

    Evelina Y. Basenko

    2018-03-01

    Full Text Available FungiDB (fungidb.org is a free online resource for data mining and functional genomics analysis for fungal and oomycete species. FungiDB is part of the Eukaryotic Pathogen Genomics Database Resource (EuPathDB, eupathdb.org platform that integrates genomic, transcriptomic, proteomic, and phenotypic datasets, and other types of data for pathogenic and nonpathogenic, free-living and parasitic organisms. FungiDB is one of the largest EuPathDB databases containing nearly 100 genomes obtained from GenBank, Aspergillus Genome Database (AspGD, The Broad Institute, Joint Genome Institute (JGI, Ensembl, and other sources. FungiDB offers a user-friendly web interface with embedded bioinformatics tools that support custom in silico experiments that leverage FungiDB-integrated data. In addition, a Galaxy-based workspace enables users to generate custom pipelines for large-scale data analysis (e.g., RNA-Seq, variant calling, etc.. This review provides an introduction to the FungiDB resources and focuses on available features, tools, and queries and how they can be used to mine data across a diverse range of integrated FungiDB datasets and records.

  2. OPPL-Galaxy, a Galaxy tool for enhancing ontology exploitation as part of bioinformatics workflows

    Science.gov (United States)

    2013-01-01

    Background Biomedical ontologies are key elements for building up the Life Sciences Semantic Web. Reusing and building biomedical ontologies requires flexible and versatile tools to manipulate them efficiently, in particular for enriching their axiomatic content. The Ontology Pre Processor Language (OPPL) is an OWL-based language for automating the changes to be performed in an ontology. OPPL augments the ontologists’ toolbox by providing a more efficient, and less error-prone, mechanism for enriching a biomedical ontology than that obtained by a manual treatment. Results We present OPPL-Galaxy, a wrapper for using OPPL within Galaxy. The functionality delivered by OPPL (i.e. automated ontology manipulation) can be combined with the tools and workflows devised within the Galaxy framework, resulting in an enhancement of OPPL. Use cases are provided in order to demonstrate OPPL-Galaxy’s capability for enriching, modifying and querying biomedical ontologies. Conclusions Coupling OPPL-Galaxy with other bioinformatics tools of the Galaxy framework results in a system that is more than the sum of its parts. OPPL-Galaxy opens a new dimension of analyses and exploitation of biomedical ontologies, including automated reasoning, paving the way towards advanced biological data analyses. PMID:23286517

  3. Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome.

    Science.gov (United States)

    Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad

    2010-11-27

    MicroRNAs (miRNAs) are 22 nucleotides non-coding RNAs that play pivotal regulatory roles in diverse organisms including the humans and are difficult to be identified due to lack of either sequence features or robust algorithms to efficiently identify. Therefore, we made a tool that is Mi-Discoverer for the detection of miRNAs in human genome. The tools used for the development of software are Microsoft Office Access 2003, the JDK version 1.6.0, BioJava version 1.0, and the NetBeans IDE version 6.0. All already made miRNAs softwares were web based; so the advantage of our project was to make a desktop facility to the user for sequence alignment search with already identified miRNAs of human genome present in the database. The user can also insert and update the newly discovered human miRNA in the database. Mi-Discoverer, a bioinformatics tool successfully identifies human miRNAs based on multiple sequence alignment searches. It's a non redundant database containing a large collection of publicly available human miRNAs.

  4. New Directions in Statistical Physics: Econophysics, Bioinformatics, and Pattern Recognition

    International Nuclear Information System (INIS)

    Grassberger, P

    2004-01-01

    This book contains 18 contributions from different authors. Its subtitle 'Econophysics, Bioinformatics, and Pattern Recognition' says more precisely what it is about: not so much about central problems of conventional statistical physics like equilibrium phase transitions and critical phenomena, but about its interdisciplinary applications. After a long period of specialization, physicists have, over the last few decades, found more and more satisfaction in breaking out of the limitations set by the traditional classification of sciences. Indeed, this classification had never been strict, and physicists in particular had always ventured into other fields. Helmholtz, in the middle of the 19th century, had considered himself a physicist when working on physiology, stressing that the physics of animate nature is as much a legitimate field of activity as the physics of inanimate nature. Later, Max Delbrueck and Francis Crick did for experimental biology what Schroedinger did for its theoretical foundation. And many of the experimental techniques used in chemistry, biology, and medicine were developed by a steady stream of talented physicists who left their proper discipline to venture out into the wider world of science. The development we have witnessed over the last thirty years or so is different. It started with neural networks where methods could be applied which had been developed for spin glasses, but todays list includes vehicular traffic (driven lattice gases), geology (self-organized criticality), economy (fractal stochastic processes and large scale simulations), engineering (dynamical chaos), and many others. By staying in the physics departments, these activities have transformed the physics curriculum and the view physicists have of themselves. In many departments there are now courses on econophysics or on biological physics, and some universities offer degrees in the physics of traffic or in econophysics. In order to document this change of attitude

  5. Web TA Production (WebTA)

    Data.gov (United States)

    US Agency for International Development — WebTA is a web-based time and attendance system that supports USAID payroll administration functions, and is designed to capture hours worked, leave used and...

  6. Web server attack analyzer

    OpenAIRE

    Mižišin, Michal

    2013-01-01

    Web server attack analyzer - Abstract The goal of this work was to create prototype of analyzer of injection flaws attacks on web server. Proposed solution combines capabilities of web application firewall and web server log analyzer. Analysis is based on configurable signatures defined by regular expressions. This paper begins with summary of web attacks, followed by detection techniques analysis on web servers, description and justification of selected implementation. In the end are charact...

  7. SCALEUS: Semantic Web Services Integration for Biomedical Applications.

    Science.gov (United States)

    Sernadela, Pedro; González-Castro, Lorena; Oliveira, José Luís

    2017-04-01

    In recent years, we have witnessed an explosion of biological data resulting largely from the demands of life science research. The vast majority of these data are freely available via diverse bioinformatics platforms, including relational databases and conventional keyword search applications. This type of approach has achieved great results in the last few years, but proved to be unfeasible when information needs to be combined or shared among different and scattered sources. During recent years, many of these data distribution challenges have been solved with the adoption of semantic web. Despite the evident benefits of this technology, its adoption introduced new challenges related with the migration process, from existent systems to the semantic level. To facilitate this transition, we have developed Scaleus, a semantic web migration tool that can be deployed on top of traditional systems in order to bring knowledge, inference rules, and query federation to the existent data. Targeted at the biomedical domain, this web-based platform offers, in a single package, straightforward data integration and semantic web services that help developers and researchers in the creation process of new semantically enhanced information systems. SCALEUS is available as open source at http://bioinformatics-ua.github.io/scaleus/ .

  8. Semantic web for dummies

    CERN Document Server

    Pollock, Jeffrey T

    2009-01-01

    Semantic Web technology is already changing how we interact with data on the Web. By connecting random information on the Internet in new ways, Web 3.0, as it is sometimes called, represents an exciting online evolution. Whether you're a consumer doing research online, a business owner who wants to offer your customers the most useful Web site, or an IT manager eager to understand Semantic Web solutions, Semantic Web For Dummies is the place to start! It will help you:Know how the typical Internet user will recognize the effects of the Semantic WebExplore all the benefits the data Web offers t

  9. Protecting innovation in bioinformatics and in-silico biology.

    Science.gov (United States)

    Harrison, Robert

    2003-01-01

    Commercial success or failure of innovation in bioinformatics and in-silico biology requires the appropriate use of legal tools for protecting and exploiting intellectual property. These tools include patents, copyrights, trademarks, design rights, and limiting information in the form of 'trade secrets'. Potentially patentable components of bioinformatics programmes include lines of code, algorithms, data content, data structure and user interfaces. In both the US and the European Union, copyright protection is granted for software as a literary work, and most other major industrial countries have adopted similar rules. Nonetheless, the grant of software patents remains controversial and is being challenged in some countries. Current debate extends to aspects such as whether patents can claim not only the apparatus and methods but also the data signals and/or products, such as a CD-ROM, on which the programme is stored. The patentability of substances discovered using in-silico methods is a separate debate that is unlikely to be resolved in the near future.

  10. Bioinformatic prediction and functional characterization of human KIAA0100 gene

    Directory of Open Access Journals (Sweden)

    He Cui

    2017-02-01

    Full Text Available Our previous study demonstrated that human KIAA0100 gene was a novel acute monocytic leukemia-associated antigen (MLAA gene. But the functional characterization of human KIAA0100 gene has remained unknown to date. Here, firstly, bioinformatic prediction of human KIAA0100 gene was carried out using online softwares; Secondly, Human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR/CRISPR-associated (Cas 9 system in U937 cells. Cell proliferation and apoptosis were next evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that human KIAA0100 gene was located on 17q11.2, and human KIAA0100 protein was located in the secretory pathway. Besides, human KIAA0100 protein contained a signalpeptide, a transmembrane region, three types of secondary structures (alpha helix, extended strand, and random coil , and four domains from mitochondrial protein 27 (FMP27. The observation on functional characterization of human KIAA0100 gene revealed that its downregulation inhibited cell proliferation, and promoted cell apoptosis in U937 cells. To summarize, these results suggest human KIAA0100 gene possibly comes within mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.

  11. Shared Bioinformatics Databases within the Unipro UGENE Platform

    Directory of Open Access Journals (Sweden)

    Protsyuk Ivan V.

    2015-03-01

    Full Text Available Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, make use of a local data model while processing different types of data. Such an approach causes an inconvenience for scientists working cooperatively and relying on the same data. This refers to the need of making multiple copies of certain files for every workplace and maintaining synchronization between them in case of modifications. Therefore, we focused on delivering a collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type, supported by UGENE such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html.

  12. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    Science.gov (United States)

    Bokulich, Nicholas A; Rideout, Jai Ram; Mercurio, William G; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F; Dutton, Rachel J; Turnbaugh, Peter J; Knight, Rob; Caporaso, J Gregory

    2016-01-01

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

  13. Customisable Scientific Web Portal for Fusion Research

    Energy Technology Data Exchange (ETDEWEB)

    Abla, G; Kim, E; Schissel, D; Flannagan, S [General Atomics, San Diego (United States)

    2009-07-01

    The Web browser has become one of the major application interfaces for remotely participating in magnetic fusion. Web portals are used to present very diverse sources of information in a unified way. While a web portal has several benefits over other software interfaces, such as providing single point of access for multiple computational services, and eliminating the need for client software installation, the design and development of a web portal has unique challenges. One of the challenges is that a web portal needs to be fast and interactive despite a high volume of tools and information that it presents. Another challenge is the visual output on the web portal often is overwhelming due to the high volume of data generated by complex scientific instruments and experiments; therefore the applications and information should be customizable depending on the needs of users. An appropriate software architecture and web technologies can meet these problems. A web-portal has been designed to support the experimental activities of DIII-D researchers worldwide. It utilizes a multi-tier software architecture, and web 2.0 technologies, such as AJAX, Django, and Memcached, to develop a highly interactive and customizable user interface. It offers a customizable interface with personalized page layouts and list of services for users to select. Customizable services are: real-time experiment status monitoring, diagnostic data access, interactive data visualization. The web-portal also supports interactive collaborations by providing collaborative logbook, shared visualization and online instant message services. Furthermore, the web portal will provide a mechanism to allow users to create their own applications on the web portal as well as bridging capabilities to external applications such as Twitter and other social networks. In this series of slides, we describe the software architecture of this scientific web portal and our experiences in utilizing web 2.0 technologies. A

  14. SBMLmod: a Python-based web application and web service for efficient data integration and model simulation.

    Science.gov (United States)

    Schäuble, Sascha; Stavrum, Anne-Kristin; Bockwoldt, Mathias; Puntervoll, Pål; Heiland, Ines

    2017-06-24

    Systems Biology Markup Language (SBML) is the standard model representation and description language in systems biology. Enriching and analysing systems biology models by integrating the multitude of available data, increases the predictive power of these models. This may be a daunting task, which commonly requires bioinformatic competence and scripting. We present SBMLmod, a Python-based web application and service, that automates integration of high throughput data into SBML models. Subsequent steady state analysis is readily accessible via the web service COPASIWS. We illustrate the utility of SBMLmod by integrating gene expression data from different healthy tissues as well as from a cancer dataset into a previously published model of mammalian tryptophan metabolism. SBMLmod is a user-friendly platform for model modification and simulation. The web application is available at http://sbmlmod.uit.no , whereas the WSDL definition file for the web service is accessible via http://sbmlmod.uit.no/SBMLmod.wsdl . Furthermore, the entire package can be downloaded from https://github.com/MolecularBioinformatics/sbml-mod-ws . We envision that SBMLmod will make automated model modification and simulation available to a broader research community.

  15. AMPA: an automated web server for prediction of protein antimicrobial regions.

    Science.gov (United States)

    Torrent, Marc; Di Tommaso, Paolo; Pulido, David; Nogués, M Victòria; Notredame, Cedric; Boix, Ester; Andreu, David

    2012-01-01

    AMPA is a web application for assessing the antimicrobial domains of proteins, with a focus on the design on new antimicrobial drugs. The application provides fast discovery of antimicrobial patterns in proteins that can be used to develop new peptide-based drugs against pathogens. Results are shown in a user-friendly graphical interface and can be downloaded as raw data for later examination. AMPA is freely available on the web at http://tcoffee.crg.cat/apps/ampa. The source code is also available in the web. marc.torrent@upf.edu; david.andreu@upf.edu Supplementary data are available at Bioinformatics online.

  16. mORCA: ubiquitous access to life science web services.

    Science.gov (United States)

    Diaz-Del-Pino, Sergio; Trelles, Oswaldo; Falgueras, Juan

    2018-01-16

    Technical advances in mobile devices such as smartphones and tablets have produced an extraordinary increase in their use around the world and have become part of our daily lives. The possibility of carrying these devices in a pocket, particularly mobile phones, has enabled ubiquitous access to Internet resources. Furthermore, in the life sciences world there has been a vast proliferation of data types and services that finish as Web Services. This suggests the need for research into mobile clients to deal with life sciences applications for effective usage and exploitation. Analysing the current features in existing bioinformatics applications managing Web Services, we have devised, implemented, and deployed an easy-to-use web-based lightweight mobile client. This client is able to browse, select, compose parameters, invoke, and monitor the execution of Web Services stored in catalogues or central repositories. The client is also able to deal with huge amounts of data between external storage mounts. In addition, we also present a validation use case, which illustrates the usage of the application while executing, monitoring, and exploring the results of a registered workflow. The software its available in the Apple Store and Android Market and the source code is publicly available in Github. Mobile devices are becoming increasingly important in the scientific world due to their strong potential impact on scientific applications. Bioinformatics should not fall behind this trend. We present an original software client that deals with the intrinsic limitations of such devices and propose different guidelines to provide location-independent access to computational resources in bioinformatics and biomedicine. Its modular design makes it easily expandable with the inclusion of new repositories, tools, types of visualization, etc.

  17. Caustic Skeleton & Cosmic Web

    Science.gov (United States)

    Feldbrugge, Job; van de Weygaert, Rien; Hidding, Johan; Feldbrugge, Joost

    2018-05-01

    We present a general formalism for identifying the caustic structure of a dynamically evolving mass distribution, in an arbitrary dimensional space. The identification of caustics in fluids with Hamiltonian dynamics, viewed in Lagrangian space, corresponds to the classification of singularities in Lagrangian catastrophe theory. On the basis of this formalism we develop a theoretical framework for the dynamics of the formation of the cosmic web, and specifically those aspects that characterize its unique nature: its complex topological connectivity and multiscale spinal structure of sheetlike membranes, elongated filaments and compact cluster nodes. Given the collisionless nature of the gravitationally dominant dark matter component in the universe, the presented formalism entails an accurate description of the spatial organization of matter resulting from the gravitationally driven formation of cosmic structure. The present work represents a significant extension of the work by Arnol'd et al. [1], who classified the caustics that develop in one- and two-dimensional systems that evolve according to the Zel'dovich approximation. His seminal work established the defining role of emerging singularities in the formation of nonlinear structures in the universe. At the transition from the linear to nonlinear structure evolution, the first complex features emerge at locations where different fluid elements cross to establish multistream regions. Involving a complex folding of the 6-D sheetlike phase-space distribution, it manifests itself in the appearance of infinite density caustic features. The classification and characterization of these mass element foldings can be encapsulated in caustic conditions on the eigenvalue and eigenvector fields of the deformation tensor field. In this study we introduce an alternative and transparent proof for Lagrangian catastrophe theory. This facilitates the derivation of the caustic conditions for general Lagrangian fluids, with

  18. Hypermedia presentation generation for semantic web information systems

    NARCIS (Netherlands)

    Frasincar, F.

    2005-01-01

    Due to Web popularity many information systems have been made available through the Web, resulting in so-called Web Information Systems (WIS). Due to the complex requirements that WIS need to ful??ll, the design of these systems is not a trivial task. Design methodologies provide guidelines for the

  19. User Needs of Digital Service Web Portals: A Case Study

    Science.gov (United States)

    Heo, Misook; Song, Jung-Sook; Seol, Moon-Won

    2013-01-01

    The authors examined the needs of digital information service web portal users. More specifically, the needs of Korean cultural portal users were examined as a case study. The conceptual framework of a web-based portal is that it is a complex, web-based service application with characteristics of information systems and service agents. In…

  20. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders

    Directory of Open Access Journals (Sweden)

    Martin Hofmann-Apitius

    2015-12-01

    Full Text Available Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in increasing amounts and types of different data produced by technical profiling technologies applied to clinical samples, animal models, and cellular systems. Yet, progress on unravelling biological mechanisms, causally driving diseases, has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer´s disease or Parkinson´s disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies—data mining techniques using genetic data as a starting point to be further enriched using other data-types, or alternatively to encode prior knowledge about disease mechanisms in a model based framework supporting reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude, that progress would be accelerated by increasing efforts on performing systematic collection of multiple data-types over time from each individual suffering from neurodegenerative disease. The work presented here has been driven by project AETIONOMY; a project funded in the course of the Innovative Medicines Initiative (IMI; which is a public-private partnership of the European Federation of Pharmaceutical Industry Associations

  1. Analysis Tool Web Services from the EMBL-EBI.

    Science.gov (United States)

    McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo

    2013-07-01

    Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods.

  2. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  3. The Bioinformatics of Integrative Medical Insights: Proposals for an International Psycho-Social and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International Psycho-Social and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  4. The value of the Semantic Web in the laboratory.

    Science.gov (United States)

    Frey, Jeremy G

    2009-06-01

    The Semantic Web is beginning to impact on the wider chemical and physical sciences, beyond the earlier adopted bio-informatics. While useful in large-scale data driven science with automated processing, these technologies can also help integrate the work of smaller scale laboratories producing diverse data. The semantics aid the discovery, reliable re-use of data, provide improved provenance and facilitate automated processing by increased resilience to changes in presentation and reduced ambiguity. The Semantic Web, its tools and collections are not yet competitive with well-established solutions to current problems. It is in the reduced cost of instituting solutions to new problems that the versatility of Semantic Web-enabled data and resources will make their mark once the more general-purpose tools are more available.

  5. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  6. Het WEB leert begrijpen

    CERN Multimedia

    Stroeykens, Steven

    2004-01-01

    The WEB could be much more useful if the computers understood something of information on the Web pages. That explains the goal of the "semantic Web", a project in which takes part, amongst others, Tim Berners Lee, the inventor of the original WEB

  7. Instant responsive web design

    CERN Document Server

    Simmons, Cory

    2013-01-01

    A step-by-step tutorial approach which will teach the readers what responsive web design is and how it is used in designing a responsive web page.If you are a web-designer looking to expand your skill set by learning the quickly growing industry standard of responsive web design, this book is ideal for you. Knowledge of CSS is assumed.

  8. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  9. Geospatial semantic web

    CERN Document Server

    Zhang, Chuanrong; Li, Weidong

    2015-01-01

    This book covers key issues related to Geospatial Semantic Web, including geospatial web services for spatial data interoperability; geospatial ontology for semantic interoperability; ontology creation, sharing, and integration; querying knowledge and information from heterogeneous data source; interfaces for Geospatial Semantic Web, VGI (Volunteered Geographic Information) and Geospatial Semantic Web; challenges of Geospatial Semantic Web; and development of Geospatial Semantic Web applications. This book also describes state-of-the-art technologies that attempt to solve these problems such as WFS, WMS, RDF, OWL, and GeoSPARQL, and demonstrates how to use the Geospatial Semantic Web technologies to solve practical real-world problems such as spatial data interoperability.

  10. Functional Reactive Programming on the Web - A Practical Evaluation

    OpenAIRE

    Young, Christian Strand

    2015-01-01

    The web as an application platform is rising rapidly. With more complex solutions written in JavaScript that run client-side, as well as server-side, challenges related to JavaScript's asynchronous nature arise. This thesis explores and applies the Functional Reactive Programming paradigm (FRP) on the web as an alternative to traditional imperative programming. The potential of FRP in a context of the web is shown through case implementations of general practical real world problems that web ...

  11. Virtual Web Services

    OpenAIRE

    Rykowski, Jarogniew

    2007-01-01

    In this paper we propose an application of software agents to provide Virtual Web Services. A Virtual Web Service is a linked collection of several real and/or virtual Web Services, and public and private agents, accessed by the user in the same way as a single real Web Service. A Virtual Web Service allows unrestricted comparison, information merging, pipelining, etc., of data coming from different sources and in different forms. Detailed architecture and functionality of a single Virtual We...

  12. The Semantic Web Revisited

    OpenAIRE

    Shadbolt, Nigel; Berners-Lee, Tim; Hall, Wendy

    2006-01-01

    The original Scientific American article on the Semantic Web appeared in 2001. It described the evolution of a Web that consisted largely of documents for humans to read to one that included data and information for computers to manipulate. The Semantic Web is a Web of actionable information--information derived from data through a semantic theory for interpreting the symbols.This simple idea, however, remains largely unrealized. Shopbots and auction bots abound on the Web, but these are esse...

  13. Web Project Management

    OpenAIRE

    Suralkar, Sunita; Joshi, Nilambari; Meshram, B B

    2013-01-01

    This paper describes about the need for Web project management, fundamentals of project management for web projects: what it is, why projects go wrong, and what's different about web projects. We also discuss Cost Estimation Techniques based on Size Metrics. Though Web project development is similar to traditional software development applications, the special characteristics of Web Application development requires adaption of many software engineering approaches or even development of comple...

  14. Bioinformatics and the Politics of Innovation in the Life Sciences

    Science.gov (United States)

    Zhou, Yinhua; Datta, Saheli; Salter, Charlotte

    2016-01-01

    The governments of China, India, and the United Kingdom are unanimous in their belief that bioinformatics should supply the link between basic life sciences research and its translation into health benefits for the population and the economy. Yet at the same time, as ambitious states vying for position in the future global bioeconomy they differ considerably in the strategies adopted in pursuit of this goal. At the heart of these differences lies the interaction between epistemic change within the scientific community itself and the apparatus of the state. Drawing on desk-based research and thirty-two interviews with scientists and policy makers in the three countries, this article analyzes the politics that shape this interaction. From this analysis emerges an understanding of the variable capacities of different kinds of states and political systems to work with science in harnessing the potential of new epistemic territories in global life sciences innovation. PMID:27546935

  15. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with the query length of 511 amino acid). © 2010 IEEE.

  16. Meta-learning framework applied in bioinformatics inference system design.

    Science.gov (United States)

    Arredondo, Tomás; Ormazábal, Wladimir

    2015-01-01

    This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.

  17. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  18. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe

    2016-01-01

    -dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (i......This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes...... and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomery. To assess their novelty, the two...

  19. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  20. MFIB: a repository of protein complexes with mutual folding induced by binding.

    Science.gov (United States)

    Fichó, Erzsébet; Reményi, István; Simon, István; Mészáros, Bálint

    2017-11-15

    It is commonplace that intrinsically disordered proteins (IDPs) are involved in crucial interactions in the living cell. However, the study of protein complexes formed exclusively by IDPs is hindered by the lack of data and such analyses remain sporadic. Systematic studies benefited other types of protein-protein interactions paving a way from basic science to therapeutics; yet these efforts require reliable datasets that are currently lacking for synergistically folding complexes of IDPs. Here we present the Mutual Folding Induced by Binding (MFIB) database, the first systematic collection of complexes formed exclusively by IDPs. MFIB contains an order of magnitude more data than any dataset used in corresponding studies and offers a wide coverage of known IDP complexes in terms of flexibility, oligomeric composition and protein function from all domains of life. The included complexes are grouped using a hierarchical classification and are complemented with structural and functional annotations. MFIB is backed by a firm development team and infrastructure, and together with possible future community collaboration it will provide the cornerstone for structural and functional studies of IDP complexes. MFIB is freely accessible at http://mfib.enzim.ttk.mta.hu/. The MFIB application is hosted by Apache web server and was implemented in PHP. To enrich querying features and to enhance backend performance a MySQL database was also created. simon.istvan@ttk.mta.hu, meszaros.balint@ttk.mta.hu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.