WorldWideScience

Sample records for bioinformatics fte proposal

  1. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  2. The Bioinformatics of Integrative Medical Insights: Proposals for an International Psycho-Social and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    We propose the formation of an International Psycho-Social and Cultural Bioinformatics Project (IPCBP) to explore the research foundations of Integrative Medical Insights (IMI) on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existential wellness. We anticipate that the IPCBP will require a unique international collaboration of academic institutions, researchers, and clinical practitioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  3. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP) to explore the research foundations of Integrative Medical Insights (IMI) on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existential wellness. We anticipate that the IPCBP will require a unique international collaboration of academic institutions, researchers, and clinical practitioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  4. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  5. Evaluation of a 5-tier scheme proposed for classification of sequence variants using bioinformatic and splicing assay data

    DEFF Research Database (Denmark)

    Walker, Logan C; Whiley, Phillip J; Houdayer, Claude

    2013-01-01

    Splicing assays are commonly undertaken in the clinical setting to assess the clinical relevance of sequence variants in disease predisposition genes. A 5-tier classification system incorporating both bioinformatic and splicing assay information was previously proposed as a method to provide...

  6. Postirradiation examination of Peach Bottom fuel test element FTE-4

    International Nuclear Information System (INIS)

    Wallroth, C.F.; Holzgraf, J.F.; Jensen, D.D.; Zumwalt, L.R.

    1977-07-01

    The report presents the irradiation results and their evaluation for Peach Bottom fuel test element FTE-4. It describes in detail the efforts by General Atomic Company over the last two years to establish a system for extracting meaningful performance information from a fuel test element. This has been done with the goal of making direct comparisons between as-measured data and core design code predictions. Special emphasis has been placed on determining the 95% confidence limits on most of the preirradiation and postirradiation measurements in order to allow a better comparison with GAUGE, FEVER, and TREVER code calculations which are used in HTGR core thermal and mechanical design

  7. Aptamer Bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew B. Kinghorn

    2017-11-01

    Aptamers are short nucleic acid sequences capable of specific, high-affinity molecular binding. They are isolated via SELEX (Systematic Evolution of Ligands by Exponential Enrichment), an evolutionary process that involves iterative rounds of selection and amplification before sequencing and aptamer characterization. As aptamers are genetic in nature, bioinformatic approaches have been used to improve both aptamers and their selection. This review will discuss the advancements made in several enclaves of aptamer bioinformatics, including simulation of aptamer selection, fragment-based aptamer design, patterning of libraries, identification of lead aptamers from high-throughput sequencing (HTS) data and in silico aptamer optimization.

  8. Headcount and FTE data in the European health workforce monitoring and planning process.

    Science.gov (United States)

    Girasek, Edmond; Kovács, Eszter; Aszalós, Zoltán; Eke, Edit; Ragány, Károly; Kovács, Réka; Cserháti, Zoltán; Szócska, Miklós

    2016-07-16

    Health workforce (HWF) planning and monitoring processes face challenges regarding data and appropriate indicators. One such area fraught with difficulties is labour activity and, more specifically, defining headcount and full-time equivalent (FTE). This study aims to review national practices in FTE calculation formulas for selected EU Member States (MS). The research was conducted as a part of the Joint Action on European Health Workforce Planning and Forecasting. Definitions, categories and terms concerning the five sectoral professions were examined in 14 MS by conducting a survey. To gain a deeper understanding of the international data-reporting processes (Joint Questionnaire on Non-Monetary Health Care Statistics-JQ), six international expert interviews were conducted by using a semi-structured interview guide. Of the 14 investigated countries, four MS indicated that they report FTE to the JQ and that they also calculate FTE data for national planning purposes. The other countries do not use FTE data for national purposes, but most of them do use special calculations and/or estimation methods for converting headcount to FTE. The findings revealed significant differences between national calculation methods when reporting FTE data to the JQ. This diversity in terms of calculations and estimations can lead to biases with respect to international comparisons. This finding was reinforced by the expert interviews, since the experts agreed that the activities of healthcare professionals are a fundamental factor in HWF monitoring and planning. Experts underscored that activity should also be measured by FTE, and not only by headcount. FTE and headcount are significant factors in HWF planning and monitoring; therefore, national data collections should place emphasis on collecting data and calculating the appropriate indicators. National FTE could serve as a call to action for HWF planners due to the lack of matching international FTE data. At the international level...
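
    The distinction between headcount and FTE in the record above can be illustrated with a minimal sketch. It assumes the common hours-ratio definition (total contracted hours divided by the hours of one full-time post); the abstract reports that national formulas differ, so this is not any Member State's actual method. The function name, the 40-hour default and the sample data are all hypothetical.

```python
from typing import Iterable


def headcount_to_fte(weekly_hours: Iterable[float], full_time_hours: float = 40.0) -> float:
    """Convert per-person contracted weekly hours into a full-time-equivalent total.

    Assumption: one FTE equals `full_time_hours` per week; the default of 40.0
    is illustrative only and is not taken from the study.
    """
    if full_time_hours <= 0:
        raise ValueError("full_time_hours must be positive")
    return sum(weekly_hours) / full_time_hours


# Hypothetical data: three professionals working 40, 20 and 30 hours per week.
staff_hours = [40.0, 20.0, 30.0]
headcount = len(staff_hours)          # counts people, regardless of hours
fte = headcount_to_fte(staff_hours)   # counts full-time posts actually filled
print(headcount, fte)
```

    With these made-up hours, the headcount is 3 while the FTE total is lower, which is the kind of gap that makes the two indicators non-interchangeable in workforce planning.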

  9. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  10. Videopodcasts in der Hochschullehre – Angehende Lehrkräfte und praktische Mediengestaltung

    Directory of Open Access Journals (Sweden)

    Daniel Rode

    2014-09-01

    Integrating new content areas into the training of physical education teachers? Daniel Rode and Martin Stern present a university course format in which prospective teachers create digital video podcasts, and they outline points of connection to educational theory and media didactics.

  11. FTE MODIFICATION OF NURSING STAFF CALCULATING FORMULA WITH TEAM PROFESIONAL NURSING CARE MODEL

    Directory of Open Access Journals (Sweden)

    Erlin Kurnia

    2017-04-01

    Introduction: Several formulas can be used to calculate nursing staff needs in inpatient rooms, including the Ministry of Health Republic of Indonesia method, Gillies, Nina Formulation, Douglas, and Full Time Equivalent (FTE). The purpose of this study was to recommend a formula for calculating nurse staffing needs when implementing the team nursing model of care delivery. Method: The design used in this study was a time and motion study. Data were collected by observation and by questionnaires given to resource persons. The population was the nurses who work in the inpatient wards of Kediri Baptist Hospital. Two inpatient wards, Ward A and Ward B, were taken as simulation sites: Ward A was simulated using the FTE method and Ward B using the Ministry of Health Republic of Indonesia method. The Ministry of Health Republic of Indonesia method yielded a required nursing staff of 17 people in Ward A and 23 in Ward B, whereas the FTE calculation yielded 20 people in Ward A and 33 in Ward B. Result: The simulation showed an increase in performance of duties and job satisfaction among nurses in the inpatient ward simulated using the FTE method. Discussion: The inpatient ward simulated using the Ministry of Health Republic of Indonesia method showed an increase in performance of duties but a decrease in job satisfaction. It can be concluded that the FTE method is more appropriate than the Ministry of Health Republic of Indonesia method.

  12. Proliferation in the normal FTE is a hallmark of the follicular phase, not BRCA mutation status.

    Science.gov (United States)

    George, Sophia H L; Milea, Anca; Shaw, Patricia A

    2012-11-15

    Women who have inherited germline mutations of BRCA1/BRCA2 are at increased risk of developing high-grade serous carcinoma, and many of these cancers arise in the distal fimbriated end of the fallopian tube. We have previously shown that the fallopian tube epithelia of BRCA1 mutation carriers (FTE-BRCA) have altered signaling pathways compared to nonmutation carriers. In this study, we sought to determine whether these differences result in a proliferative advantage to the epithelia in this high-risk patient population and to investigate whether the postovulation environment of the FTE-BRCA compared to FTE from nonmutation carriers experiences a differential abundance of immune cells. Immunohistochemistry for Ki67, CD3, CD8, CD20, and CD68 was performed on histologically normal tubal epithelium (ampulla, n = 83), fimbria (n = 18) with known ovarian cycle status and germline mutation status and for Ki67 on fimbrial epithelium from women (n = 144) with and without BRCA1 or BRCA2 mutations who underwent risk-reducing salpingo-oophorectomy (RRSO). Serous tubal intraepithelial carcinomas (STIC) with concomitant cancer (n = 15) were also analyzed for presence of immune infiltrates. All slides were digitized and analyzed using automated image analysis software. There was no significant difference in the proliferative index in histologically normal FTE between BRCA1/BRCA2 and non-BRCA, in 144 fimbriae and 83 ampullae. The FTE-BRCA1 epithelia did not exhibit a differential presence of lymphocytes or macrophages; however, more macrophages were present in the luteal phase compared to the follicular phase epithelia. In STICs, macrophages were more abundant than lymphocytes, with an incremental increase noted with disease progression. BRCA1/2 mutation carriers exhibited no significant increase in proliferation in the fallopian tube epithelial cells either in the ampulla or fimbriated ends of the tube. Rather, a significant proliferative increase was defined in the cases determined...

  13. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework for designing MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
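
    The maximum expected accuracy (MEA) principle mentioned in the record above can be sketched for the simplest case: an unconstrained binary vector with known marginal probabilities. The gain function used here (gamma times true positives plus true negatives), the function names and the sample probabilities are assumptions for illustration; the sketch does not reproduce the generalized estimators proposed in the paper.

```python
from typing import List, Sequence


def expected_gain(prediction: Sequence[int], marginals: Sequence[float], gamma: float) -> float:
    """Expected value of gamma * TP + TN.

    marginals[i] is the (assumed known) probability that position i is truly 1.
    Predicting 1 at i earns gamma with probability p; predicting 0 earns 1 with
    probability 1 - p.
    """
    return sum(
        gamma * p if y == 1 else (1.0 - p)
        for y, p in zip(prediction, marginals)
    )


def mea_estimate(marginals: Sequence[float], gamma: float) -> List[int]:
    """Maximize the expected gain position by position.

    Without constraints between positions, the optimum predicts 1 exactly when
    gamma * p > 1 - p, i.e. when p exceeds 1 / (gamma + 1).
    """
    return [1 if gamma * p > (1.0 - p) else 0 for p in marginals]


# Hypothetical marginal probabilities for three binary variables.
probs = [0.9, 0.4, 0.2]
print(mea_estimate(probs, gamma=1.0))
print(mea_estimate(probs, gamma=2.0))
```

    Raising gamma weights true positives more heavily and lowers the decision threshold, which is how an estimator of this kind can be tuned toward sensitivity or toward PPV.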

  14. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  15. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  16. Computational Biology and Bioinformatics in Nigeria

    Science.gov (United States)

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  17. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  18. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  19. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  20. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press.

  1. Postirradiation examination and evaluation of Peach Bottom fuel test element FTE-6

    International Nuclear Information System (INIS)

    Wallroth, C.F.; Holzgraf, J.F.; Jensen, D.D.

    1977-09-01

    Fuel test element FTE-6 was irradiated in the Peach Bottom high-temperature gas-cooled reactor (HTGR) for 645 equivalent full power days. Four fuel varieties, contained in H-327 graphite bodies, were tested. A primary result of this test has been to demonstrate acceptable performance even with calculated high stresses in the graphite bodies. Heterogeneous fuel loadings in the element caused local power peaking and azimuthal power variations, deforming the graphite fuel bodies and thereby causing bowing nearly five times as large as the diametral clearance within the sleeve. The axial stresses resulting from interference between the fuel bodies and sleeve were estimated to have reached 45% of the ultimate material strength at the end of the irradiation. Residual stresses from differential contraction within the fuel body resulted in probable in-plane stress levels of 130% of the material strength at the end-of-life shutdown and of up to 150% of the strength at shutdown during the irradiation cycle. The high in-plane stresses are local peaks at the corners of a sharp notch in the element, which may account for the stresses failing to cause damage. The lack of observable damage, however, indicates that the methods and data used for stress analysis give results that are either fairly accurate or conservative

  2. Teaching bioinformatics to engineers.

    Science.gov (United States)

    Mihalas, George I; Tudor, Anca; Paralescu, Sorin; Andor, Minodora; Stoicu-Tivadar, Lacramioara

    2014-01-01

    The paper refers to our methodology and experience in establishing the content of the course in bioinformatics introduced to the school of "Information Systems in Healthcare" (SIIS), master level. The syllabi of both lectures and laboratory works are presented and discussed.

  3. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers.

    Science.gov (United States)

    Schneider, Maria V; Walter, Peter; Blatter, Marie-Claude; Watson, James; Brazas, Michelle D; Rother, Kristian; Budd, Aidan; Via, Allegra; van Gelder, Celia W G; Jacob, Joachim; Fernandes, Pedro; Nyrönen, Tommi H; De Las Rivas, Javier; Blicher, Thomas; Jimenez, Rafael C; Loveland, Jane; McDowall, Jennifer; Jones, Phil; Vaughan, Brendan W; Lopez, Rodrigo; Attwood, Teresa K; Brooksbank, Catherine

    2012-05-01

    Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of 'high-throughput biology', the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it in relation to similar initiatives and collections.

  4. An integrative bioinformatics pipeline for the genomewide ...

    Indian Academy of Sciences (India)

    2013-12-06

    The majority of miRNAs in pig (Sus scrofa), an important domestic animal, remain unknown. From this perspective, we attempted the genomewide identification of novel porcine miRNAs. Here, we propose a novel integrative bioinformatics pipeline to identify conservative and non-conservative novel ...

  5. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  6. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  7. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
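
    The remark above about the huge number of possible trees for even a modest number of OTUs can be made concrete with the standard counting formulas for fully resolved (binary) labelled trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n OTUs. The sketch below is only an illustration of that growth, with hypothetical function names; it is not taken from the reviewed paper.

```python
def odd_double_factorial(k: int) -> int:
    """Product of the odd integers 1 * 3 * 5 * ... * k (returns 1 when k <= 1)."""
    result = 1
    for i in range(3, k + 1, 2):
        result *= i
    return result


def num_unrooted_trees(n_otus: int) -> int:
    """Number of distinct unrooted binary tree topologies on n labelled OTUs: (2n-5)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n_otus - 5)


def num_rooted_trees(n_otus: int) -> int:
    """Number of distinct rooted binary tree topologies on n labelled OTUs: (2n-3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n_otus - 3)


# Print how quickly the search space grows with the number of OTUs.
for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

    Because the count grows faster than exponentially, exhaustive evaluation of every topology is only practical for very small samples, which is why tree estimation generally relies on heuristic search.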

  8. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  9. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials...

  10. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    Science.gov (United States)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  11. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. Open standards, data
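
    As a rough illustration of the "density-guided binary space partitioning" idea behind probability binning, the sketch below recursively splits a set of events at the midpoint of the sorted values along the most variable channel, so that every bin ends up with about the same number of events. It is a simplified sketch under stated assumptions (the function name, the variance-based choice of axis, and the toy events are illustrative), not the implementation used by any particular flow cytometry package.

```python
from statistics import pvariance


def probability_bins(events, depth):
    """Split events (a list of equal-length tuples) into up to 2**depth bins.

    At each level the channel with the largest variance is chosen and the
    events are cut at the middle of the sorted order, so both halves hold
    (almost) the same number of events.
    """
    if depth == 0 or len(events) < 2:
        return [events]
    n_channels = len(events[0])
    # Assumption: pick the channel with the greatest spread at this level.
    axis = max(range(n_channels),
               key=lambda d: pvariance([e[d] for e in events]))
    ordered = sorted(events, key=lambda e: e[axis])
    mid = len(ordered) // 2
    return (probability_bins(ordered[:mid], depth - 1)
            + probability_bins(ordered[mid:], depth - 1))


if __name__ == "__main__":
    # Toy two-channel events, invented purely for illustration.
    toy_events = [(i, (i * 7) % 16) for i in range(16)]
    for bin_events in probability_bins(toy_events, depth=2):
        print(len(bin_events), bin_events)
```

    Bins built this way on a control sample can then be applied to a test sample, and the per-bin counts compared.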

  12. E-learning and blended-learning offerings: opportunities for continuing professional education for daycare (Kita) professionals

    Directory of Open Access Journals (Sweden)

    Christina Buschle

    2018-03-01

    This article addresses the opportunities that e-learning offerings, and blended-learning offerings in particular, present for the job-related continuing education of early childhood education professionals. It also identifies the difficulties and challenges that using these formats entails in the daycare centre (Kita) work setting and in continuing education. To this end, it gives an overview of selected continuing education offerings that use both formats. In addition, it examines the continuing education behaviour of the professionals, and thus their needs, and outlines the state of research on blended-learning measures in continuing education. Blended-learning formats can profitably complement job-related continuing education offerings for Kita professionals. So far, however, offerings and target group are only moderately well matched. In times of scarce staff resources, blended learning must nevertheless not be understood as an economical solution for meeting continuing education needs. Rather, early childhood education and continuing education must participate equally in modernisation processes with regard to their media use. Only in this way can the offerings be used in a manner that makes sense for institutions, professionals, and children, the transformation processes in the Kita field be advanced, and the social inclusion and participation of Kita professionals be secured.

  13. 4273π: Bioinformatics education on low cost ARM hardware

    Science.gov (United States)

    2013-01-01

    Background: Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results: We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions: 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194

  14. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  15. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  16. Bioinformatics meets parasitology.

    Science.gov (United States)

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention. © 2011 Blackwell Publishing Ltd.

  17. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  18. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  19. Agile parallel bioinformatics workflow management using Pwrake.

    Science.gov (United States)

    Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro

    2011-09-08

    In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles

  20. Agile parallel bioinformatics workflow management using Pwrake

    Science.gov (United States)

    2011-01-01

    Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability

  1. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  2. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  3. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  4. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  5. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  6. How to test bioinformatics software?

    Science.gov (United States)

    Kamali, Amir Hossein; Giannoulatou, Eleni; Chen, Tsong Yueh; Charleston, Michael A; McEwan, Alistair L; Ho, Joshua W K

    2015-09-01

    Bioinformatics is the application of computational, mathematical and statistical techniques to solve problems in biology and medicine. Bioinformatics programs developed for computational simulation and large-scale data analysis are widely used in almost all areas of biophysics. The appropriate choice of algorithms and correct implementation of these algorithms are critical for obtaining reliable computational results. Nonetheless, it is often very difficult to systematically test these programs as it is often hard to verify the correctness of the output, and to effectively generate failure-revealing test cases. Software testing is an important process of verification and validation of scientific software, but very few studies have directly dealt with the issues of bioinformatics software testing. In this work, we review important concepts and state-of-the-art methods in the field of software testing. We also discuss recent reports on adapting and implementing software testing methodologies in the bioinformatics field, with specific examples drawn from systems biology and genomic medicine.
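
    The difficulty of verifying output correctness can sometimes be sidestepped by checking relations that must hold between outputs, rather than checking any single output against a known answer. The abstract does not name a specific technique; the sketch below uses metamorphic-style relations on a toy GC-content routine as one possible illustration. All function names and the chosen relations are assumptions made for this example.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def relations_hold(seq: str) -> bool:
    """Check relations that need no known 'correct' GC value for seq."""
    baseline = gc_content(seq)
    return (
        # G<->C and A<->T swaps preserve the G+C count and the length.
        gc_content(reverse_complement(seq)) == baseline
        # Doubling the sequence doubles both numerator and denominator.
        and gc_content(seq + seq) == baseline
        # Reverse complement is its own inverse.
        and reverse_complement(reverse_complement(seq)) == seq
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so failures are reproducible
    for _ in range(100):
        candidate = "".join(rng.choice("ACGT") for _ in range(50))
        assert relations_hold(candidate), candidate
    print("all relations held")
```

    A failure of any relation reveals a bug even though the expected GC value of the random input was never computed independently.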

  7. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange between bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  8. Genome Exploitation and Bioinformatics Tools

    Science.gov (United States)

    de Jong, Anne; van Heel, Auke J.; Kuipers, Oscar P.

    Bioinformatic tools can greatly improve the efficiency of bacteriocin screening efforts by limiting the number of strains. Different classes of bacteriocins can be detected in genomes by looking at different features. Finding small bacteriocins can be especially challenging due to low homology and because small open reading frames (ORFs) are often omitted from annotations. In this chapter, several bioinformatic tools/strategies to identify bacteriocins in genomes are discussed.
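
    To illustrate why small ORFs are easy to miss and how they can be recovered, the sketch below scans the three forward reading frames of a DNA string for ATG-to-stop stretches whose length falls inside a configurable window. It is a deliberately minimal example: the function name, the default length limits, and the restriction to the forward strand and to ATG starts are all assumptions for illustration, not the strategy of any tool discussed in the chapter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(dna, min_codons=10, max_codons=100):
    """Return (start, end, sequence) for forward-strand ORFs in a size window.

    An ORF here runs from the first ATG in a frame to the next in-frame stop
    codon; its length in codons excludes the stop. Coordinates are 0-based,
    with end exclusive and including the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if min_codons <= n_codons <= max_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Toy sequence invented for illustration: one 12-codon ORF in frame 2.
    toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    print(find_small_orfs(toy))
```

    Lowering `min_codons` below the cut-off used by an annotation pipeline is what surfaces candidates that the annotation left out; a real screen would also scan the reverse strand and filter the hits.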

  9. Bioinformatics on the cloud computing platform Azure.

    Directory of Open Access Journals (Sweden)

    Hugh P Shanahan

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.

  10. Bioinformatics on the cloud computing platform Azure.

    Science.gov (United States)

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.

  11. Emerging strengths in Asia Pacific bioinformatics

    OpenAIRE

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-01-01

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology,...

  12. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  13. Assessment of Data Reliability of Wireless Sensor Network for Bioinformatics

    Directory of Open Access Journals (Sweden)

    Ting Dong

    2017-09-01

    Full Text Available As a focal point of biotechnology, bioinformatics integrates knowledge from biology, mathematics, physics, chemistry, computer science and information science. It generally deals with genome informatics, protein structure and drug design. However, the data or information thus acquired from the main areas of bioinformatics may not be effective. Some researchers have combined bioinformatics with wireless sensor networks (WSN) in biosensors and other tools, and applied them to such areas as fermentation, environmental monitoring, food engineering, clinical medicine and the military. In this combination, the WSN is used to collect data and information. The reliability of the WSN in bioinformatics is the prerequisite to effective utilization of information. It is greatly influenced by factors like quality, benefits, service, timeliness and stability, some of which are qualitative and some quantitative. Hence, it is necessary to develop a method that can handle both qualitative and quantitative assessment of information. A viable option is the fuzzy linguistic method, especially the 2-tuple linguistic model, which has been extensively used to cope with such issues. As a result, this paper introduces the 2-tuple linguistic representation to assist experts in giving their opinions on different WSNs in bioinformatics that involve multiple factors. Moreover, the author proposes a novel way to determine attribute weights and uses the method to weigh the relative importance of different influencing factors, which can be considered as attributes in the assessment of the WSN in bioinformatics. Finally, an illustrative example is given to provide a reasonable solution for the assessment.
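
    The 2-tuple linguistic representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard formulation in which a value beta in [0, g] is written as a label index i plus a symbolic translation alpha in [-0.5, 0.5). The five-term label set, the ratings and the weights below are hypothetical, and the weighted mean is a generic aggregation, not the attribute-weighting method proposed in the paper.

```python
import math

# Hypothetical linguistic term set s_0..s_4 (not taken from the paper).
LABELS = ["very poor", "poor", "fair", "good", "very good"]


def to_two_tuple(beta):
    """Map beta in [0, g] to (i, alpha): nearest label index plus offset."""
    g = len(LABELS) - 1
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)   # round half up, unlike Python's round()
    return i, beta - i           # alpha lies in [-0.5, 0.5)


def from_two_tuple(i, alpha):
    """Inverse mapping: recover the numeric value beta."""
    return i + alpha


def weighted_mean(ratings, weights):
    """Aggregate 2-tuples by a weighted arithmetic mean of their beta values."""
    total = sum(weights)
    beta = sum(from_two_tuple(i, a) * w for (i, a), w in zip(ratings, weights)) / total
    return to_two_tuple(beta)


if __name__ == "__main__":
    # Three hypothetical expert ratings of one WSN attribute, with weights 5:3:2.
    idx, alpha = weighted_mean([(3, 0.0), (4, 0.0), (2, 0.0)], [5, 3, 2])
    print(LABELS[idx], round(alpha, 3))
```

    Keeping alpha alongside the label is what lets the aggregate be reported as "good, slightly above" without losing information to rounding.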

  14. Top considerations for creating bioinformatics software documentation.

    Science.gov (United States)

    Karimzadeh, Mehran; Hoffman, Michael M

    2017-01-14

    Investing in documenting your bioinformatics software well can increase its impact and save your time. To maximize the effectiveness of your documentation, we suggest following a few guidelines we propose here. We recommend providing multiple avenues for users to use your research software, including a navigable HTML interface with a quick start, useful help messages with detailed explanation and thorough examples for each feature of your software. By following these guidelines, you can assure that your hard work maximally benefits yourself and others. © The Author 2017. Published by Oxford University Press.

  15. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm using a new sigmoid-logarithmic transfer function based on the error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  16. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract. Background: Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial, as incorrectly computed results may lead to wrong biological conclusions or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results: We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs), thus greatly increasing the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion: This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work...
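
    The idea of checking pairs of outputs against metamorphic relations can be sketched as follows. This is an illustration only: `map_reads` is a hypothetical toy exact-match mapper, not SeqMap or GNLab, and the two relations and the seeded off-by-one fault are assumptions chosen for the example rather than the relations used in the paper.

```python
import random


def map_reads(reads, reference):
    """Toy exact-match mapper: {read: list of 0-based start positions}."""
    hits = {}
    for read in reads:
        last_start = len(reference) - len(read)
        positions = [i for i in range(last_start + 1)
                     if reference.startswith(read, i)]
        if positions:
            hits[read] = positions
    return hits


def buggy_map_reads(reads, reference):
    """Fault-seeded variant: off-by-one, never tries the last start position."""
    hits = {}
    for read in reads:
        last_start = len(reference) - len(read)
        positions = [i for i in range(last_start)
                     if reference.startswith(read, i)]
        if positions:
            hits[read] = positions
    return hits


def mr_permutation(mapper, reads, reference, rng):
    """MR1: shuffling the order of the reads must not change the result."""
    shuffled = list(reads)
    rng.shuffle(shuffled)
    return mapper(reads, reference) == mapper(shuffled, reference)


def mr_reversal(mapper, reads, reference):
    """MR2: reversing reads and reference together keeps the set of mapped reads."""
    before = mapper(reads, reference)
    after = mapper([r[::-1] for r in reads], reference[::-1])
    return {r[::-1] for r in before} == set(after)


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    reads = ["ACG", "TTA", "GGG", "AGC"]
    for mapper in (map_reads, buggy_map_reads):
        print(mapper.__name__,
              mr_permutation(mapper, reads, reference, random.Random(0)),
              mr_reversal(mapper, reads, reference))
```

    Neither relation needs to know the correct positions for any read. The permutation relation does not expose the seeded fault, while the reversal relation does, which is why a set of several relations is normally used.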

  17. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  18. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http://www.secondarymetabolites.org) is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field....

  19. JV TASK 7-FIELD APPLICATION OF THE FREEZE-THAW/EVAPORATION (FTE) PROCESS FOR THE TREATMENT OF NATURAL GAS PRODUCED WATER IN WYOMING

    Energy Technology Data Exchange (ETDEWEB)

    James A. Sorensen; John Boysen; Deidre Boysen; Tim Larson

    2002-10-01

    The freeze-thaw/evaporation (FTE®) process treats oil and gas produced water so that the water can be beneficially used. The FTE® process is the coupling of evaporation and freeze-crystallization, and in climates where subfreezing temperatures seasonally occur, this coupling improves process economics compared to evaporation alone. An added benefit of the process is that water of a quality suited for a variety of beneficial uses is produced. The evolution, from concept to successful commercial deployment, of the FTE® process for the treatment of natural gas produced water has now been completed. In this document, the histories of two individual commercial deployments of the FTE® process are discussed. In Wyoming, as in many other states, the permitting and regulation of oil and gas produced water disposal and/or treatment facilities depend upon the legal relationship between owners of the facility and the owners of wells from which the water is produced. An "owner-operated" facility is regulated by the Wyoming Oil and Gas Conservation Commission (WOGCC) and is defined as an entity which only processes water which comes from the wells in fields of which they have an equity interest. However, if a facility processes water from wells in which the owners of the facility have no equity interest, the facility is considered a "commercial" facility and is permitted and regulated by the Wyoming Department of Environmental Quality. For this reason, of the two commercial FTE® process deployments discussed in this document, one is related to an "owner-operated" facility, and the other relates to a "commercial" facility. Case 1 summarizes the permitting, design, construction, operation, and performance of the FTE® process at an "owner-operated" facility located in the Jonah Field of southwestern Wyoming. This

  20. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  1. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  2. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  3. The Aspergillus Mine - publishing bioinformatics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Rasmussen, Jane Lind Nybo; Theobald, Sebastian

    with the Joint Genome Institute. The Aspergillus Mine is not intended as a genomic data sharing service but instead focuses on creating an environment where the results of bioinformatic analysis are made available for inspection. The data and code are public upon request and figures can be obtained directly from...

  4. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  5. Computational intelligence techniques in bioinformatics.

    Science.gov (United States)

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Door reflectie uit de spagaat tussen 0,3 fte en het ambacht [Using reflection to escape the split between 0.3 FTE and the craft]

    Directory of Open Access Journals (Sweden)

    Akkelien Canrinus

    2006-12-01

    Full Text Available Akkelien Canrinus tries to come up with an alternative to Tayloristic management in the sectors of care, welfare and education, by taking up the challenge to work out the concept of professionalism and by showing how street-level workers in youth care can bring their professionalism to the fore. She illustrates this with the case of a welfare organisation which was approached by a housing corporation to offer support to a group of young residents of temporary studios. Along the way, the assignment changes; the welfare workers are now only supposed to report about trouble caused by the youth, instead of fulfilling a broader signalling and supportive function. Management and professional youth workers subsequently end up in opposition. Management is mainly focused on the economic side of the story and reasons that each lucrative assignment needs to be taken up. The workers are mainly focused on the content of the job and find that this kind of work does not fit within their concept of what the profession entails. By acknowledging each other's positions, by looking at the proposal from different perspectives and engaging in ‘a suitable dance’ with each other, the welfare organisation in the end manages to find a mode in which all parties involved can do their piece of the work in a satisfactory way. This only became possible after the youth care team had indeed taken its own professionalism seriously and adopted a pro-active attitude. Instead of waiting for management to steer top-down, the team took the power themselves and used their own professionalism to generate a solution bottom-up.

  7. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  8. FIELD DEPLOYMENT EVALUATION OF THE FREEZE-THAW/EVAPORATION (FTE) PROCESS TO TREAT OIL AND GAS PRODUCED WATERS. Task 45. Final topical report; FINAL

    International Nuclear Information System (INIS)

    Ames A. Grisanti; James A. Sorensen

    1999-01-01

    TASK 45 FIELD DEPLOYMENT EVALUATION OF THE FREEZE-THAW/EVAPORATION (FTE) PROCESS TO TREAT OIL AND GAS PRODUCED WATERS [...] coupling evaporation with freezing. This offers operators a year-round method for treating produced water. Treating water with the FTE process reduces the volume of water to be disposed of as well as purifying the water to a level acceptable for watering livestock and agricultural lands. This process is currently used at two evaporation facilities, one in the San Juan Basin in New Mexico and one in the Green River Basin in Wyoming. [...] the freezing point below that of pure water. When such a solution is cooled below 32 °F, relatively pure ice crystals form, along with an unfrozen brine solution that contains elevated concentrations of salts. Because of the brine's high concentration of these constituents, its density is greater than that of the ice, and the purified ice and brine are easily separated. Coupling the natural processes of freezing and evaporation makes the FTE process a more cost-effective and efficient method for the treatment and disposal of produced water and allows for year-round operation of an FTE facility. [...] drops below 32 °F, produced water is automatically pumped from a holding pond and sprayed onto a freezing pad. The freezing pad consists of an elevated framework of piping with regularly placed, upright, extendable spray heads similar to those used to irrigate lawns. As the spray freezes, an ice pile forms over the elevated framework of pipes, and the brine, with an elevated constituent concentration, drains from the ice pile. The high-salinity brine, identified by its high electrical conductivity, is separated using automatic valves and pumped to a pond where it can subsequently be disposed of by conventional methods. As the ice pile increases in height, the sprayers are extended. When the ice on the freezing pad melts, the relatively pure water is pumped from the freezing pad and discharged or stored for later use. No new

  9. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  10. Bioinformatics in the information age

    Energy Technology Data Exchange (ETDEWEB)

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  11. Bioinformatics for Genome Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gary J. Olsen

    2005-06-30

    as we feel that our other proposed approaches will ultimately be better.

  12. Utility library for structural bioinformatics.

    Science.gov (United States)

    Gront, Dominik; Kolinski, Andrzej

    2008-02-15

    In this Note we present a new software library for structural bioinformatics. The library contains programs for computing sequence- and profile-based alignments and a variety of structural calculations, with user-friendly handling of various data formats. The software organization is very flexible. Algorithms are written in the Java language and may be used by Java programs. Moreover, the modules can be accessed from Jython (Python scripting language implemented in Java) scripts. Finally, the new version of BioShell delivers several utility programs that can do typical bioinformatics tasks from a command-line level. Availability: The software is available for download free of charge from its website: http://bioshell.chem.uw.edu.pl. This website also provides numerous examples, code snippets and API documentation.

  13. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  14. Bioinformatics in Africa: The Rise of Ghana?

    Directory of Open Access Journals (Sweden)

    Thomas K Karikari

    2015-09-01

    Full Text Available Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  15. Bioinformatics in Africa: The Rise of Ghana?

    Science.gov (United States)

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  16. Search strategies in structural bioinformatics.

    Science.gov (United States)

    Oakley, Mark T; Barthel, Daniel; Bykov, Yuri; Garibaldi, Jonathan M; Burke, Edmund K; Krasnogor, Natalio; Hirst, Jonathan D

    2008-06-01

    Optimisation problems pervade structural bioinformatics. In this review, we describe recent work addressing a selection of bioinformatics challenges. We begin with a discussion of research into protein structure comparison, and highlight the utility of Kolmogorov complexity as a measure of structural similarity. We then turn to research into de novo protein structure prediction, in which structures are generated from first principles. In this endeavour, there is a compromise between the detail of the model and the extent to which the conformational space of the protein can be sampled. We discuss some developments in this area, including off-lattice structure prediction using the great deluge algorithm. One strategy to reduce the size of the search space is to restrict the protein chain to sites on a regular lattice. In this context, we highlight the use of memetic algorithms, which combine genetic algorithms with local optimisation, to the study of simple protein models on the two-dimensional square lattice and the face-centred cubic lattice.
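
    Kolmogorov complexity itself is not computable, so similarity measures built on it are usually approximated with a real compressor. The sketch below uses the normalized compression distance with zlib as one such approximation. The choice of zlib, the function names, and the encoding of a structure as a byte string (here a made-up secondary-structure string) are assumptions for illustration and are not taken from the review.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    rng = random.Random(0)
    structure = b"HHHHEEEE" * 100                            # hypothetical encoded structure
    unrelated = bytes(rng.randrange(256) for _ in range(800))  # incompressible noise
    print("self distance:     ", ncd(structure, structure))
    print("unrelated distance:", ncd(structure, unrelated))
```

    How a three-dimensional structure is serialized before compression largely determines how meaningful the distance is, so the encoding step deserves more care than this toy example gives it.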

  17. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  18. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    Full Text Available The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes).

  19. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  20. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  1. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
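
    As an illustration of the formulation described in this abstract, the sketch below computes an optimal global alignment with dynamic programming (the Needleman-Wunsch recurrence). The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for the example; they are not taken from the article.

```python
from typing import List, Tuple


def align(a: str, b: str, match: int = 1, mismatch: int = -1,
          gap: int = -2) -> Tuple[int, str, str]:
    """Return (optimal score, aligned a, aligned b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score: List[List[int]] = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(align("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

    The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory both grow with the product of the two sequence lengths.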

  2. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  3. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  4. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  5. Bioinformatics clouds for big data manipulation

    Science.gov (United States)

    2012-01-01

    Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  6. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  7. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. © 2012 Dai et al.; licensee BioMed Central Ltd.

  8. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
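
    To make the MapReduce idea mentioned in this abstract concrete, here is a single-process sketch that counts k-mers in sequencing reads using separate map, shuffle, and reduce phases. It only imitates the programming model; it does not use any cloud framework, and the function names and the two example reads are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_kmers(read: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (k-mer, 1) for every window of length k in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce phase: collapse the values of one key into a single count."""
    return key, sum(values)


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    mapped = (pair for read in reads for pair in map_kmers(read, k))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


print(count_kmers(["ACGT", "CGTA"], 2))
```

    In a real deployment the map calls would run on different machines over different chunks of the read file, and the framework would perform the shuffle across the network; the per-key independence of the reduce step is what makes that distribution possible.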

  9. Ten Year Overview of HCC FTE Enrollment: Fiscal Year 1976 to Fiscal Year 1985 [and] Thirteen Percent of HCC Students Have Earned Previous Degrees [and] Percent of Student Population by Age. Flash Facts IR Mini Reports; Volume 1, Numbers 1-3.

    Science.gov (United States)

    Radcliffe, Susan; And Others

    These mini-reports present summaries of data and research findings about the students of Howard Community College (HCC) in Maryland. In the first report, trends in full-time equivalent (FTE), credit, and continuing education enrollments are charted from 1976 to 1985, showing a 92% increase in total FTE enrollment. The second report focuses on HCC…

  10. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream to nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed the time of the most recent common ancestor of human/civet SARS related coronavirus to be 1999-2002, with an estimated substitution rate of 4×10⁻⁴ to 2×10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
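
    The G + C content quoted in this abstract is a simple ratio: the number of G and C bases divided by the number of all counted bases, expressed as a percentage. A minimal sketch follows; the function name is an assumption and the example sequences are made-up fragments, not coronavirus data.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the A, C, G, T/U bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


print(gc_content("AUGC"))    # 50.0
print(gc_content("atgcgc"))
```

    Ambiguity codes such as N are ignored here rather than counted in the denominator; other conventions exist and would give slightly different percentages.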

  11. Deciphering psoriasis. A bioinformatic approach.

    Science.gov (United States)

    Melero, Juan L; Andrades, Sergi; Arola, Lluís; Romeu, Antoni

    2018-02-01

    Psoriasis is an immune-mediated, inflammatory and hyperproliferative disease of the skin and joints. The cause of psoriasis is still unknown. The fundamental feature of the disease is the hyperproliferation of keratinocytes and the recruitment of cells from the immune system in the region of the affected skin, which leads to deregulation of the expression of many well-known genes. Based on data mining and bioinformatic scripting, here we show a new dimension of the effect of psoriasis at the genomic level. Using our own pipeline of scripts in Perl and MySQL and based on the freely available NCBI Gene Expression Omnibus (GEO) database: DataSet Record GDS4602 (Series GSE13355), we explore the extent of the effect of psoriasis on gene expression in the affected tissue. We give greater insight into the effects of psoriasis on the up-regulation of some genes in the cell cycle (CCNB1, CCNA2, CCNE2, CDK1) or the dynamin system (GBPs, MXs, MFN1), as well as the down-regulation of typical antioxidant genes (catalase, CAT; superoxide dismutases, SOD1-3; and glutathione reductase, GSR). We also provide a complete list of the human genes and how they respond in a state of psoriasis. Our results show that psoriasis affects all chromosomes and many biological functions. If we further consider the stable and mitotically inheritable character of the psoriasis phenotype, and the influence of environmental factors, then it seems that psoriasis has an epigenetic origin. This fits well with the strong hereditary character of the disease as well as its complex genetic background. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.

  12. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    cancer immunotherapies has yet to be fulfilled. The insufficient efficacy of existing treatments can be attributed to a number of biological and technical issues. In this review, we detail the current limitations of immunotherapy target selection and design, and review computational methods to streamline...... therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  13. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  14. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  15. MicroRNA from tuberculosis RNA: A bioinformatics study

    OpenAIRE

    Wiwanitkit, Somsri; Wiwanitkit, Viroj

    2012-01-01

    The role of microRNA in the pathogenesis of pulmonary tuberculosis is currently a topic of interest in chest medicine. Recently, it was proposed that microRNA can be a useful biomarker for monitoring pulmonary tuberculosis and might play an important part in the pathogenesis of the disease. Here, the authors perform a bioinformatics study to assess the microRNA within known tuberculosis RNA. The microRNA part can be detected and this can be important key information in further study of the p...

  16. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Planning bioinformatics workflows using an expert system.

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T

    2017-04-15

    Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab. Contact: jeffrey.t.chang@uth.tmc.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
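
    The backwards-chaining idea in this abstract can be sketched in a few lines: start from the data type that is wanted, look up a rule that produces it, and recurse on that rule's inputs until only available data remain. This is a toy illustration, not BETSY's implementation or API; the rule table, data type names, and step names are all hypothetical, and each data type has at most one producing rule to keep the example short.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical knowledge base: produced data type -> (step name, required inputs).
RULES: Dict[str, Tuple[str, List[str]]] = {
    "aligned_reads": ("align_reads", ["fastq", "reference_genome"]),
    "gene_counts": ("count_reads", ["aligned_reads", "gene_annotation"]),
    "differential_expression": ("test_differences", ["gene_counts", "sample_groups"]),
}


def plan(goal: str, available: Set[str],
         seen: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Chain backwards from goal; return ordered steps, or None if impossible."""
    if goal in available:
        return []                      # nothing to compute
    if goal in seen or goal not in RULES:
        return None                    # cyclic dependency or no rule makes it
    step, inputs = RULES[goal]
    workflow: List[str] = []
    for needed in inputs:
        sub = plan(needed, available, seen | {goal})
        if sub is None:
            return None
        workflow.extend(s for s in sub if s not in workflow)
    workflow.append(step)              # inputs are ready, so this step can run
    return workflow


have = {"fastq", "reference_genome", "gene_annotation", "sample_groups"}
print(plan("differential_expression", have))
```

    Changing the goal or the set of available data yields a different workflow without editing any pipeline code, which is the property the abstract highlights for exploratory analyses.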

  18. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  19. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotide polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapter 2, 3 and 4), and the subsequent

  20. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  1. An integrative bioinformatics pipeline for the genomewide ...

    Indian Academy of Sciences (India)

    An integrative bioinformatics pipeline for the genomewide identification of novel porcine microRNA genes. Wei Fang, Na Zhou, Dengyun Li, Zhigang Chen, Pengfei Jiang and Deli Zhang. J. Genet. 92, 587-593. Figure 1. Primary sequence of the predicted SSc-mir-2053 precursor and locations of some terms in the secondary ...

  2. Novel bioinformatic developments for exome sequencing

    NARCIS (Netherlands)

    Lelieveld, S.H.; Veltman, J.A.; Gilissen, C.F.

    2016-01-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers

  3. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is the need for appropriate strategies of introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool an implementation of ...

  4. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora of programs for PCR primer design reflects the.

  5. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  6. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  7. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  8. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  9. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  10. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  11. Bioinformatics applications in proteomics data analysis

    NARCIS (Netherlands)

    Peng, M.

    2015-01-01

    In this thesis, I detail my 4-year efforts in developing bioinformatics tools and algorithms to address the growing demands of current proteomics endeavors, covering a range of facets such as large-scale protein expression profiling, charting post-translation modifications as well as

  12. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  13. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    The aim of this review is to discuss the importance of bioinformatics and emphasize the need to acquire bioinformatics training and skills so as to maximize its potentials for improved delivery of animal health. In this review, bioinformatics is introduced, challenges to effective animal disease diagnosis, prevention and control, ...

  14. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  15. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzhen, China, September 20-22, 2017.

  16. Bioinformatics Approaches for Human Gut Microbiome Research

    Directory of Open Access Journals (Sweden)

    Zhijun Zheng

    2016-07-01

    Full Text Available The human microbiome has received much attention because many studies have reported that the human gut microbiome is associated with several diseases. The very large datasets that are produced by these kinds of studies mean that bioinformatics approaches are crucial for their analysis. Here, we systematically reviewed bioinformatics tools that are commonly used in microbiome research, including a typical pipeline and software for sequence alignment, abundance profiling, enterotype determination, taxonomic diversity, identifying differentially abundant species/genes, gene cataloging, and functional analyses. We also summarized the algorithms and methods used to define metagenomic species and co-abundance gene groups to expand our understanding of unclassified and poorly understood gut microbes that are undocumented in the current genome databases. Additionally, we examined the methods used to identify metagenomic biomarkers based on the gut microbiome, which might help to expand the knowledge and approaches for disease detection and monitoring.

  17. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... changes. These methods have been developed further and applied for computational screens of genomic sequence. Furthermore, a number of additional directions have emerged. These include methods to search for RNA 3D structure, RNA-RNA interactions, and design of interfering RNAs (RNAi) as well as methods...... for interactions between RNA and proteins. Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....
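
    As a rough illustration of single-sequence secondary structure prediction, the sketch below uses the classic Nussinov base-pair maximization recurrence. This is a textbook method chosen here for brevity; it is not necessarily what the chapter presents, and the function name, the allowed-pair set (Watson-Crick plus G-U wobble) and the minimum loop length of 3 are assumptions made for this example.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Assumptions (illustrative only): Watson-Crick and G-U wobble pairs are
    allowed, and at least `min_loop` unpaired bases separate paired positions.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, leaving a loop >= min_loop
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

    For example, nussinov_max_pairs("GGGAAAUCC") counts the three stacked pairs of a small hairpin. Practical tools rely on thermodynamic energy models rather than simple pair counts.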

  18. Applying instructional design theories to bioinformatics education in microarray analysis and primer design workshops.

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of Gagne's Conditions of Learning instructional design theory. This theory, although first published in the early 1970s, is still fundamental in instructional design and instructional technology. First, top-level as well as prerequisite learning objectives for a microarray analysis workshop and a primer design workshop were defined. Then a hierarchy of objectives for each workshop was created. Hands-on tutorials were designed to meet these objectives. Finally, events of learning proposed by Gagne's theory were incorporated into the hands-on tutorials. The resultant manuals were tested on a small number of trainees, revised, and applied in 1-day bioinformatics workshops. Based on this experience and on observations made during the workshops, we conclude that Gagne's Conditions of Learning instructional design theory provides a useful framework for developing bioinformatics training, but may not be optimal as a method for teaching it.

  19. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Full Text Available Cloud computing has started to change how bioinformatics research is carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables applications to output results in real time. Its main purpose is to guarantee application performance while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees application performance while saving up to 20.14% of resources.

  20. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS.

    Science.gov (United States)

    Hu, Rongdong; Liu, Guangming; Jiang, Jingfei; Wang, Lixin

    2015-01-01

    Cloud computing has started to change how bioinformatics research is carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables applications to output results in real time. Its main purpose is to guarantee application performance while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees application performance while saving up to 20.14% of resources.

  1. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  2. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Full Text Available Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and, at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on a nimble IT framework that can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Conclusions: Use of the microservices framework

  3. ballaxy: web services for structural bioinformatics.

    Science.gov (United States)

    Hildebrandt, Anna Katharina; Stöckel, Daniel; Fischer, Nina M; de la Garza, Luis; Krüger, Jens; Nickels, Stefan; Röttig, Marc; Schärfe, Charlotta; Schumann, Marcel; Thiel, Philipp; Lenhof, Hans-Peter; Kohlbacher, Oliver; Hildebrandt, Andreas

    2015-01-01

    Web-based workflow systems have gained considerable momentum in sequence-oriented bioinformatics. In structural bioinformatics, however, such systems are still relatively rare; while commercial stand-alone workflow applications are common in the pharmaceutical industry, academic researchers often still rely on command-line scripting to glue individual tools together. In this work, we address the problem of building a web-based system for workflows in structural bioinformatics. For the underlying molecular modelling engine, we opted for the BALL framework because of its extensive and well-tested functionality in the field of structural bioinformatics. The large number of molecular data structures and algorithms implemented in BALL allows for elegant and sophisticated development of new approaches in the field. We hence connected the versatile BALL library and its visualization and editing front end BALLView with the Galaxy workflow framework. The result, which we call ballaxy, enables the user to simply and intuitively create sophisticated pipelines for applications in structure-based computational biology, integrated into a standard tool for molecular modelling. ballaxy consists of three parts: some minor modifications to the Galaxy system, a collection of tools and an integration into the BALL framework and the BALLView application for molecular modelling. Modifications to Galaxy will be submitted to the Galaxy project, and the BALL and BALLView integrations will be integrated in the next major BALL release. After acceptance of the modifications into the Galaxy project, we will publish all ballaxy tools via the Galaxy toolshed. In the meantime, all three components are available from http://www.ball-project.org/ballaxy. Also, docker images for ballaxy are available at https://registry.hub.docker.com/u/anhi/ballaxy/dockerfile/. ballaxy is licensed under the terms of the GPL. © The Author 2014. Published by Oxford University Press. All rights reserved. For

  4. Geotail observations of FTE velocities

    Directory of Open Access Journals (Sweden)

    G. I. Korotova

    2009-01-01

    Full Text Available We discuss the plasma velocity signatures expected in association with flux transfer events (FTEs). Events moving faster than or opposite the ambient media should generate bipolar inward/outward (outward/inward) flow perturbations normal to the nominal magnetopause in the magnetosphere (magnetosheath). Flow perturbations directly upstream and downstream from the events should be in the direction of event motion. Flows on the flanks should be in the direction opposite the motion of events moving at subsonic and subAlfvénic speeds relative to the ambient plasma. Events moving with the ambient flow should generate no flow perturbations in the ambient plasma. Alfvén waves propagating parallel (antiparallel) to the axial magnetic field of FTEs may generate anticorrelated (correlated) magnetic field and flow perturbations within the core region of FTEs. We present case studies illustrating many of these signatures. In the examples considered, Alfvén waves propagate along event axes away from the inferred reconnection site. A statistical study of FTEs observed by Geotail over a 3.5-year period reveals that FTEs within the magnetosphere invariably move faster than the ambient flow, while those in the magnetosheath move both faster and slower than the ambient flow.

  5. Targeting folate metabolism for therapeutic option: A bioinformatics approach.

    Science.gov (United States)

    Hande, Sneha; Goswami, Kalyan; Sharma, Richa; Bhoj, Priyanka; Jena, Lingaraj; Reddy, Maryada Venkata Rami

    2015-11-01

    Lymphatic filariasis, commonly called elephantiasis, poses an estimated burden of 5.09 million disability-adjusted life years. Limitations of its sole drug, diethylcarbamazine (DEC), drive the exploration of effective filarial targets. A few plant extracts having polyphenolic ingredients and some synthetic compounds possess potential dihydrofolate reductase (DHFR) inhibitory effect. Here, we postulated a plausible link between folates and polyphenolics based on their common precursor in shikimate metabolism. Considering its implication in structural resemblance based antagonism, we have attempted to validate parasitic DHFR protein as a target. The bioinformatics approach, used in the absence of a crystal structure of the proposed target to authenticate it and for virtual docking with suitable tested compounds, showed remarkably lower thermodynamic parameters as opposed to the positive control. A comparative docking analysis between human and Brugia malayi DHFR also showed effective binding parameters with lower inhibition constants of these ligands with the parasitic target, but not with the human counterpart, highlighting safety and efficacy. This study suggests that DHFR could be a valid drug target for lymphatic filariasis, and further reveals that bioinformatics may be an effective tool in a reverse pharmacological approach to drug design.

  6. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  7. Bioinformatics training: a review of challenges, actions and support requirements.

    Science.gov (United States)

    Schneider, Maria Victoria; Watson, James; Attwood, Teresa; Rother, Kristian; Budd, Aidan; McDowall, Jennifer; Via, Allegra; Fernandes, Pedro; Nyronen, Tommy; Blicher, Thomas; Jones, Phil; Blatter, Marie-Claude; De Las Rivas, Javier; Judge, David Phillip; van der Gool, Wouter; Brooksbank, Cath

    2010-11-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first Trainer Networking Session held under the auspices of the EU-funded SLING Integrating Activity, which took place in November 2009.

  8. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  9. Logical comparison over RDF resources in bio-informatics.

    Science.gov (United States)

    Colucci, S; Donini, F M; Di Sciascio, E

    2017-12-01

    Comparison of resources is a frequent task in different bio-informatics applications, including drug-target interaction, drug repositioning and mechanism of action understanding, among others. This paper proposes a general method for the logical comparison of resources modeled in Resource Description Framework and shows its distinguishing features with reference to the comparison of drugs. In particular, the method returns a description of the commonalities between resources, rather than a numerical value estimating their similarity and/or relatedness. The approach is domain-independent and may be flexibly adapted to heterogeneous use cases, according to a process for setting parameters which is completely explicit. The paper also presents an experiment using the dataset Bioportal as knowledge source; the experiment is fully reproducible, thanks to the elicitation of criteria and values for parameter customization. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Bioinformatic analysis of whole genome sequencing data

    OpenAIRE

    Maqbool, Khurram

    2014-01-01

    Evolution has shaped life forms for billions of years. Domestication is an accelerated process that can be used as a model for evolutionary changes. The aim of this thesis project has been to carry out extensive bioinformatic analyses of whole genome sequencing data to reveal SNPs, InDels and selective sweeps in the chicken, pig and dog genomes. Pig genome sequencing revealed loci under selection for elongation of back and increased number of vertebrae, associated with the NR6A1, PLAG1,...

  11. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it? What’s this revolution triggered by the human genome project? Are there any results yet? What are the problems? What new avenues of research have been opened up? What about the technology? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similarities and contrasts will stimulate new curiosity and provoke new thoughts.

  12. Multiobjective optimization in bioinformatics and computational biology.

    Science.gov (United States)

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.
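
    The central concept behind multiobjective optimization, Pareto dominance, can be sketched in a few lines. The example below is a minimal illustration and not code from the review; the function names are hypothetical and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if solution `a` Pareto-dominates `b` (all objectives minimized).

    `a` dominates `b` when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective scores (e.g. error vs. model size), both minimized.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

    Here (3, 4) is dominated by (2, 3) and (5, 5) by (1, 5), so the front keeps the three remaining trade-off solutions. This quadratic-time filter is fine for small sets; larger problems typically use dedicated non-dominated sorting.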

  13. Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course.

    Science.gov (United States)

    Luo, Jingchu

    2014-11-01

    With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab biologists became necessary as a part of their education. This article describes a one-semester course 'Applied Bioinformatics Course' (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate students at Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems related to their ongoing research projects in molecular biology. With a brief introduction to the background of the course, detailed information about the teaching strategies of the course is outlined in the 'How to teach' section. The contents of the course are briefly described in the 'What to teach' section with some real examples. The author wishes to share his teaching experiences and the online teaching materials with colleagues working in bioinformatics education both in local and international universities. © The Author 2013. Published by Oxford University Press.

  14. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  15. Bioinformatics tools for predicting GPCR gene functions.

    Science.gov (United States)

    Suwa, Makiko

    2014-01-01

    The automatic classification of GPCRs by bioinformatics methodology can provide functional information for new GPCRs in the whole 'GPCR proteome', and this information is important for the development of novel drugs. Since the GPCR proteome is classified hierarchically, general ways for GPCR function prediction are based on hierarchical classification. Various computational tools have been developed to predict GPCR functions; those tools use not simple sequence searches but more powerful methods, such as alignment-free methods, statistical model methods, and machine learning methods used in protein sequence analysis, based on learning datasets. The first stage of hierarchical function prediction involves the discrimination of GPCRs from non-GPCRs and the second stage involves the classification of the predicted GPCR candidates into family, subfamily, and sub-subfamily levels. Then, further classification is performed according to their protein-protein interaction type: binding G-protein type, oligomerized partner type, etc. Those methods have achieved predictive accuracies of around 90%. Finally, I describe future research directions for bioinformatics techniques for the functional prediction of GPCRs.

  16. repo: an R package for data-centered management of bioinformatic pipelines.

    Science.gov (United States)

    Napolitano, Francesco

    2017-02-16

    Reproducibility in Data Analysis research has long been a significant concern, particularly in the areas of Bioinformatics and Computational Biology. Towards the aim of developing reproducible and reusable processes, Data Analysis management tools can help give structure and coherence to complex data flows. Nonetheless, improved software quality comes at the cost of additional design and planning effort, which may become impractical in rapidly changing development environments. I propose that an adjustment of focus from processes to data in the management of Bioinformatic pipelines may help improve reproducibility with minimal impact on preexisting development practices. In this paper I introduce the repo R package for bioinformatic analysis management. The tool supports a data-centered philosophy that aims at improving analysis reproducibility and reusability with minimal design overhead. The core of repo lies in its support for easy data storage, retrieval, distribution and annotation. In repo the data analysis flow is derived a posteriori from dependency annotations. The repo package constitutes an unobtrusive data and flow management extension of the R statistical language. Its adoption, together with good development practices, can help improve data analysis management, sharing and reproducibility, especially in the fields of Bioinformatics and Computational Biology.

  17. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...

  18. Bioinformatic and Biometric Methods in Plant Morphology

    Directory of Open Access Journals (Sweden)

    Surangi W. Punyasena

    2014-08-01

    Full Text Available Recent advances in microscopy, imaging, and data analyses have permitted both the greater application of quantitative methods and the collection of large data sets that can be used to investigate plant morphology. This special issue, the first for Applications in Plant Sciences, presents a collection of papers highlighting recent methods in the quantitative study of plant form. These emerging biometric and bioinformatic approaches to plant sciences are critical for better understanding how morphology relates to ecology, physiology, genotype, and evolutionary and phylogenetic history. From microscopic pollen grains and charcoal particles, to macroscopic leaves and whole root systems, the methods presented include automated classification and identification, geometric morphometrics, and skeleton networks, as well as tests of the limits of human assessment. All demonstrate a clear need for these computational and morphometric approaches in order to increase the consistency, objectivity, and throughput of plant morphological studies.

  19. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... biology and genetics studies. We present an improved Lotus genome assembly and annotation, a catalog of natural variation based on re-sequencing of 29 accessions, and describe the involvement of small RNAs in the plant-bacteria symbiosis. Blueberries contain anthocyanins, other pigments and various...... polyphenolic compounds, which have been linked to protection against diabetes, cardiovascular disease and age-related cognitive decline. We present the first genome-guided approach in blueberry to identify genes involved in the synthesis of health-protective compounds. Using RNA-Seq data from five stages...

  20. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES, 27, 28 February, 1, 2, 3 March 2006, from 11:00 to 12:00 - Auditorium, bldg. 500. Decoding the Genome: a special series of 5 lectures on recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics. The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  1. Combining multiple decisions: applications to bioinformatics

    International Nuclear Information System (INIS)

    Yukinawa, N; Ishii, S; Takenouchi, T; Oba, S

    2008-01-01

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods.
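
    The general ECOC idea can be illustrated with a small decoding sketch: each class is assigned a codeword over the binary classifiers, and a sample is assigned to the class whose codeword is closest to the observed binary outputs. The code matrix, function name and optional per-classifier weights below are assumptions for illustration; this is plain (optionally weighted) Hamming decoding, not the tuned-weight or probabilistic methods reviewed in the article.

```python
# Hypothetical code matrix: one row per class, one column per binary
# classifier; +1/-1 says which side of that binary split the class is on.
# Every pair of rows differs in 4 positions, so one wrong bit is correctable.
CODE_MATRIX = [
    [+1, +1, +1, -1, -1, -1],  # class 0
    [+1, -1, -1, +1, +1, -1],  # class 1
    [-1, +1, -1, +1, -1, +1],  # class 2
    [-1, -1, +1, -1, +1, +1],  # class 3
]


def ecoc_decode(binary_outputs, code_matrix=CODE_MATRIX, weights=None):
    """Return the index of the class whose codeword best matches the outputs.

    `weights` (assumed, optional) lets more reliable classifiers count more;
    with the default of all ones this is ordinary Hamming-distance decoding.
    """
    if weights is None:
        weights = [1.0] * len(binary_outputs)
    distances = [
        sum(w for bit, out, w in zip(codeword, binary_outputs, weights)
            if bit != out)
        for codeword in code_matrix
    ]
    return min(range(len(distances)), key=distances.__getitem__)


# Class 1's codeword with its last bit flipped by a misclassification.
noisy = [+1, -1, -1, +1, +1, +1]
predicted = ecoc_decode(noisy)
```

    Because the codewords are well separated, the single flipped bit in the example does not change the decoded class.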

  2. Bioinformatic Analysis of Strawberry GSTF12 Gene

    Science.gov (United States)

    Wang, Xiran; Jiang, Leiyu; Tang, Haoru

    2018-01-01

    GSTF12 has long been known as a key factor in proanthocyanin accumulation in the plant testa. Bioinformatic analysis of the nucleotide and encoded protein sequences of GSTF12 benefits the study of genes related to the anthocyanin biosynthesis and accumulation pathway. Therefore, we chose the GSTF12 genes of 11 species as the research object, downloaded their nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene via bioinformatic analysis, and constructed a phylogenetic tree. At the same time, we analysed the physical and chemical properties of the strawberry GSTF12 gene, its protein structure, and so on. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction showed that the protein has one signal peptide and no obvious transmembrane regions.

  3. Rust-Bio: a fast and safe bioinformatics library

    NARCIS (Netherlands)

    J. Köster (Johannes)

    2015-01-01

    We present Rust-Bio, the first general purpose bioinformatics library for the innovative Rust programming language. Rust-Bio leverages the unique combination of speed, memory safety and high-level syntax offered by Rust to provide a fast and safe set of bioinformatics algorithms and data

  4. Current status and future perspectives of bioinformatics in Tanzania ...

    African Journals Online (AJOL)

    The main bottleneck in advancing genomics in present times is the lack of expertise in using bioinformatics tools and approaches for data mining in raw DNA sequences generated by modern high throughput technologies such as next generation sequencing. Although bioinformatics has been making major progress and ...

  5. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Directory of Open Access Journals (Sweden)

    Nomi L Harris

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  6. Recent developments in life sciences research: Role of bioinformatics

    African Journals Online (AJOL)

    Life sciences research and development has opened up new challenges and opportunities for bioinformatics. The contribution of bioinformatics advances made possible the mapping of the entire human genome and genomes of many other organisms in just over a decade. These discoveries, along with current efforts to ...

  7. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  8. Perspectives on presentation and pedagogy in aid of bioinformatics education.

    Science.gov (United States)

    Buttigieg, Pier Luigi

    2010-11-01

    Using live presentation to communicate the interdisciplinary and abstract content of bioinformatics to its educationally diverse studentship is a sizeable challenge. This review collects a number of perspectives on multimedia presentation, visual communication and pedagogy. The aim is to encourage educators to reflect on the great potential of live presentation in facilitating bioinformatics education.

  9. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is lack of bioinformatics training in the universities, except for short training courses in which ...

  10. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  11. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision, and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  12. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  13. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  14. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  15. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  16. Learning structural bioinformatics and evolution with a snake puzzle

    Directory of Open Access Journals (Sweden)

    Gonzalo S. Nido

    2016-12-01

    We propose here a working unit for teaching basic concepts of structural bioinformatics and evolution through the example of a wooden snake puzzle, strikingly similar to toy models widely used in the literature of protein folding. In our experience, developed at a Master’s course at the Universidad Autónoma de Madrid (Spain), the concreteness of this example helps to overcome difficulties caused by the interdisciplinary nature of this field and its high level of abstraction, in particular for students coming from traditional disciplines. The puzzle will allow us to discuss a simple algorithm for finding folded solutions, through which we will introduce the concept of the configuration space and the contact matrix representation. This is a central tool for comparing protein structures, for studying simple models of protein energetics, and even for a qualitative discussion of folding kinetics, through the concept of the Contact Order. It also allows a simple representation of misfolded conformations and their free energy. These concepts will motivate evolutionary questions, which we will address by simulating a structurally constrained model of protein evolution, again modelled on the snake puzzle. In this way, we can discuss the analogy between evolutionary concepts and statistical mechanics that facilitates the understanding of both concepts. The proposed examples and literature are accessible, and we provide supplementary material (see ‘Data Availability’) to reproduce the numerical experiments. We also suggest possible directions to expand the unit. We hope that this work will further stimulate the adoption of games in teaching practice.
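
    The contact matrix and Contact Order mentioned above can be illustrated with a short Python sketch. It is not taken from the paper's supplementary material: the lattice conformation, the function names, and the contact rule (non-consecutive units on adjacent lattice sites) are assumptions chosen for illustration, and the relative contact order is computed with the commonly used definition, the mean sequence separation of contacting pairs divided by the chain length.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def contact_matrix(chain: List[Coord]) -> List[List[int]]:
    """Mark pairs of non-consecutive units that occupy adjacent lattice sites."""
    n = len(chain)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 2, n):  # skip i == j and chain neighbours
            lattice_distance = sum(abs(a - b) for a, b in zip(chain[i], chain[j]))
            if lattice_distance == 1:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


def relative_contact_order(matrix: List[List[int]]) -> float:
    """Mean sequence separation of the contacts, normalized by chain length."""
    n = len(matrix)
    separations = [j - i for i in range(n) for j in range(i + 2, n) if matrix[i][j]]
    if not separations:
        return 0.0
    return sum(separations) / (len(separations) * n)


if __name__ == "__main__":
    # Hypothetical six-unit hairpin lying in the z = 0 plane.
    hairpin = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
    cm = contact_matrix(hairpin)
    for row in cm:
        print(row)
    print(relative_contact_order(cm))
```

    In this hairpin the only contacts are between units 0 and 5 and between units 1 and 4, so the relative contact order is (5 + 3) / (2 × 6) = 2/3.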

  17. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  18. Bioinformatic landscapes for plant transcription factor system research.

    Science.gov (United States)

    Wang, Yijun; Lu, Wenjie; Deng, Dexiang

    2016-02-01

    Diverse bioinformatic resources have been developed for plant transcription factor (TF) research. This review presents the bioinformatic resources and methodologies for the elucidation of plant TF-mediated biological events. Such information is helpful to dissect the transcriptional regulatory systems in the three reference plants Arabidopsis, rice, and maize and translation to other plants. Transcription factors (TFs) orchestrate diverse biological programs by the modulation of spatiotemporal patterns of gene expression via binding cis-regulatory elements. Advanced sequencing platforms accompanied by emerging bioinformatic tools revolutionize the scope and extent of TF research. The system-level integration of bioinformatic resources is beneficial to the decoding of TF-involved networks. Herein, we first briefly introduce general and specialized databases for TF research in three reference plants Arabidopsis, rice, and maize. Then, as proof of concept, we identified and characterized heat shock transcription factor (HSF) members through the TF databases. Finally, we present how the integration of bioinformatic resources at -omics layers can aid the dissection of TF-mediated pathways. We also suggest ways forward to improve the bioinformatic resources of plant TFs. Leveraging these bioinformatic resources and methodologies opens new avenues for the elucidation of transcriptional regulatory systems in the three model systems and translation to other plants.

  19. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can be anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology and efficient genome mining, has been emerging as the long-awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives. The absence of food-specific bioinformatic mining tools, the slow integration of experimental mining and bioinformatics, and the important differences between experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery. In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, thereby achieving higher success rates.

  20. Bioinformatics Tools for Extracellular Vesicles Research.

    Science.gov (United States)

    Keerthikumar, Shivakumar; Gangoda, Lahiru; Gho, Yong Song; Mathivanan, Suresh

    2017-01-01

    Extracellular vesicles (EVs) are a class of membranous vesicles that are released by multiple cell types into the extracellular environment. This unique class of extracellular organelles, which plays a pivotal role in intercellular communication, is conserved across prokaryotes and eukaryotes. Depending upon the cell origin and the functional state, the molecular cargo including proteins, lipids, and RNA within the EVs is modulated. Owing to this, EVs are considered a subrepertoire of the host cell and are rich reservoirs of disease biomarkers. In addition, the availability of EVs in multiple bodily fluids including blood has created significant interest in biomarker and signaling research. With the advancement in high-throughput techniques, multiple EV studies have embarked on profiling the molecular cargo. To benefit the scientific community, existing free Web-based resources including ExoCarta, EVpedia, and Vesiclepedia catalog multiple datasets. These resources aid in elucidating the molecular mechanisms and pathophysiology underlying different disease conditions from which EVs are isolated. Here, the existing bioinformatics tools to perform integrated analysis to identify key functional components in the EV datasets are discussed.

  1. Bioinformatics study of the mangrove actin genes

    Science.gov (United States)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes bioinformatics methods used to analyze eight actin genes from mangrove plants deposited in DDBJ/EMBL/GenBank and to predict their structure, composition, subcellular localization, similarity, and phylogeny. The physical and chemical properties varied among the eight mangrove actin genes. The proportions of secondary structure followed the order α-helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast, the remaining actin genes followed the order random coil > extended chain structure > α-helix. The secondary structure prediction thus provided the necessary structural information. The values for chloroplast or mitochondrial targeting and for signal peptides were too small, indicating that mangrove actin genes have no chloroplast or mitochondrial transit peptide and no secretory pathway signal peptide. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove actin genes. To clarify the relationships among the mangrove actin genes, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed: the first contains B. gymnorrhiza BgAct and R. stylosa RsActl; the second, the largest, consists of five actin genes; and the last branch consists of one gene, B. sexangula Act. The present study therefore supports previous results that plant actin genes form distinct clusters in the tree.

  2. Bioinformatics in cancer therapy and drug design

    International Nuclear Information System (INIS)

    Horbach, D.Y.; Usanov, S.A.

    2005-01-01

    One of the mechanisms by which external signals (ionizing radiation, toxicants, stress) are transduced to the target cell relies on membrane and intracellular proteins with intrinsic tyrosine kinase activity. It is therefore unsurprising that the etiology of malignant growth is linked to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR and the appearance of receptor mutants with an altered capacity for protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently being used to investigate the design and selection of drugs that alter receptor structure or bind receptors competitively and thus display antagonistic characteristics. (authors)

  3. Evolution of web services in bioinformatics.

    Science.gov (United States)

    Neerincx, Pieter B T; Leunissen, Jack A M

    2005-06-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformaticians have experimented with several strategies to try to integrate data sets and tools. Owing to the lack of standards for data sets and the interfaces of the tools this is not a trivial task. Over the past few years building services with web-based interfaces has become a popular way of sharing the data and tools that have resulted from many bioinformatics projects. This paper discusses the interoperability problem and how web services are being used to try to solve it, resulting in the evolution of tools with web interfaces from HTML/web form-based tools not suited for automatic workflow generation to a dynamic network of XML-based web services that can easily be used to create pipelines.

  4. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and for epitope prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  5. cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud.

    Science.gov (United States)

    Hodor, Paul; Chawla, Amandeep; Clark, Andrew; Neal, Lauren

    2016-01-15

    One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern. Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git. Contact: hodor_paul@bah.com. © The Author 2015. Published by Oxford University Press.
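
    To make the MapReduce programming pattern named above concrete, here is a minimal single-process Python sketch that counts k-mers in sequencing reads. It is not one of the cl-dash sample applications and does not use Hadoop; the function names and the k-mer counting task are assumptions chosen for illustration. The map step emits (k-mer, 1) pairs, the shuffle step groups values by key, and the reduce step sums each group.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_kmers(read: str, k: int) -> List[Tuple[str, int]]:
    """Map step: emit a (k-mer, 1) pair for every window of length k in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key, as the framework would."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a small in-memory input."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=2))
```

    On a real cluster the map and reduce functions would run on different nodes and the framework would perform the shuffle; the structure of the two user-supplied functions stays the same.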

  6. A middleware-based platform for the integration of bioinformatic services

    Directory of Open Access Journals (Sweden)

    Guzmán Llambías

    2015-08-01

    Performing bioinformatics experiments involves intensive access to distributed services and information resources over the Internet. Although existing tools facilitate the implementation of workflow-oriented applications, they lack the capabilities to integrate services beyond small-scale applications, particularly services with heterogeneous interaction patterns and at a larger scale. This is particularly required to enable large-scale distributed processing of biological data generated by massive sequencing technologies. On the other hand, such integration mechanisms are provided by middleware products like Enterprise Service Buses (ESB), which make it possible to integrate distributed systems following a Service Oriented Architecture. This paper proposes an integration platform, based on enterprise middleware, to integrate bioinformatics services. It presents a multi-level reference architecture and focuses on ESB-based mechanisms to provide asynchronous communications, event-based interactions and data transformation capabilities. The paper presents a formal specification of the platform using the Event-B model.

  7. The potential of translational bioinformatics approaches for pharmacology research.

    Science.gov (United States)

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of transformational bioinformatics to research in numerous pharmacology subdisciplines. © 2015 The British Pharmacological Society.

  8. Development of a cloud-based Bioinformatics Training Platform

    Science.gov (United States)

    Revote, Jerico; Watson-Haigh, Nathan S.; Quenette, Steve; Bethwaite, Blair; McGrath, Annette

    2017-01-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. PMID:27084333

  9. In silico cloning and bioinformatic analysis of PEPCK gene in ...

    African Journals Online (AJOL)

    Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. According to the relative conservation of homologous gene, a bioinformatics strategy was applied to clone Fusarium ...

  10. Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools

    Science.gov (United States)

    Diaz Acosta, B.

    2011-01-01

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative is comprised of two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework—initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.

  11. Bioinformatics Education in Pathology Training: Current Scope and Future Direction

    Directory of Open Access Journals (Sweden)

    Michael R Clay

    2017-04-01

    Training anatomic and clinical pathology residents in the principles of bioinformatics is a challenging endeavor. Most residents receive little to no formal exposure to bioinformatics during medical education, and most of the pathology training is spent interpreting histopathology slides using light microscopy or focused on laboratory regulation, management, and interpretation of discrete laboratory data. At a minimum, residents should be familiar with data structure, data pipelines, data manipulation, and data regulations within clinical laboratories. Fellowship-level training should incorporate advanced principles unique to each subspecialty. Barriers to bioinformatics education include the clinical apprenticeship training model, ill-defined educational milestones, inadequate faculty expertise, and limited exposure during medical training. Online educational resources, case-based learning, and incorporation into molecular genomics education could serve as effective educational strategies. Overall, pathology bioinformatics training can be incorporated into pathology resident curricula, provided there is motivation to incorporate, institutional support, educational resources, and adequate faculty expertise.

  12. Bioinformatics Analyst | Center for Cancer Research

    Science.gov (United States)

    BASIC QUALIFICATIONS: To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below: Bachelor’s degree in life science/bioinformatics/math/physics/computer related field from an accredited college or university according to the Council for Higher Education Accreditation (CHEA). (Additional qualifying experience may be substituted for the required education). Foreign degrees must be evaluated for U.S. equivalency. In addition to the educational requirements, a minimum of five (5) years of progressively responsible relevant experience. Must be able to obtain and maintain a security clearance.

    PREFERRED QUALIFICATIONS: Candidates with these desired skills will be given preferential consideration: A Master’s or PhD degree in any quantitative science is preferred. Commitment to solving biological problems and communicating these solutions. Ability to multi-task across projects. Experience in submitting data sets to public repositories. Management of large genomic data sets including integration with data available from public sources. Prior customer-facing role. Record of scientific achievements including journal publications and conference presentations.

    EXPECTED COMPETENCIES: Deep understanding of and experience in processing high throughput biomedical data: data cleaning, normalization, analysis, interpretation and visualization. Ability to understand and analyze data from complex experimental designs. Proficiency in at least two of the following programming languages: Perl, Python, R, Java and C/C++. Experience in at least two of the following areas: metagenomics, ChIPSeq, RNASeq, ExomeSeq, DHS-Seq, microarray analysis. Familiarity with public databases: NCBI, Ensembl, TCGA, cBioPortal, Broad FireHose. Knowledge of working in a cluster environment.

  13. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.

    Science.gov (United States)

    Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel

    2015-06-02

    Bioinformaticians face a range of difficulties in getting locally installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS mediates access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on ordinary computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that lists the registered applications and clients. Applications registered with BOWS can be accessed from virtually any programming language through web services, or using standard Java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run computationally demanding applications directly from their machines.
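
    The front-end/back-end split described in this abstract can be illustrated with a small in-memory sketch. This is not the BOWS API: the class `JobBroker`, its method names, and the `revcomp` tool are all hypothetical, chosen only to show a client submitting a job, a cluster-side worker polling for it, and the client reading the result.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Callable, Deque, Dict, Optional


@dataclass
class Job:
    job_id: int
    tool: str
    params: dict
    result: Optional[str] = None


class JobBroker:
    """Hypothetical in-memory stand-in for a job-brokering service."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._pending: Deque[Job] = deque()
        self._jobs: Dict[int, Job] = {}

    # Front-end role: used by client programs on ordinary machines.
    def submit(self, tool: str, params: dict) -> int:
        job = Job(next(self._ids), tool, params)
        self._jobs[job.job_id] = job
        self._pending.append(job)
        return job.job_id

    def read_result(self, job_id: int) -> Optional[str]:
        return self._jobs[job_id].result

    # Back-end role: used by the worker that runs next to the tool.
    def next_job(self, tool: str) -> Optional[Job]:
        for job in list(self._pending):
            if job.tool == tool:
                self._pending.remove(job)
                return job
        return None

    def send_result(self, job_id: int, result: str) -> None:
        self._jobs[job_id].result = result


def run_worker_once(broker: JobBroker, tool: str,
                    handler: Callable[[dict], str]) -> bool:
    """Poll once for a job of the given tool; return True if one was processed."""
    job = broker.next_job(tool)
    if job is None:
        return False
    broker.send_result(job.job_id, handler(job.params))
    return True


def revcomp(params: dict) -> str:
    """Toy tool: reverse complement of params['seq'] (assumes A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(params["seq"].upper()))
```

    In a real deployment the two roles would sit behind network endpoints and the worker would poll on a schedule; here both are plain method calls so the flow can be followed in one process.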

  14. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

  15. Use of BEXA Family Algorithms in Bioinformatics Data Classification

    OpenAIRE

    Gasparoviča-Asīte, M; Aleksejeva, L; Gersons, V

    2012-01-01

    This article studies the possibilities of BEXA family classification algorithms – BEXA, FuzzyBexa and FuzzyBexa II in data, especially bioinformatics data, classification. Three different types of data sets were used in the study – data sets often used in the literature (like Iris data set), UCI data repository real life data sets (like breast cancer data set) and real bioinformatics data sets that have the specific character – a large number of attributes (several thousands) and a small numb...

  16. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    Science.gov (United States)

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  17. What can bioinformatics do for Natural History museums?

    Directory of Open Access Journals (Sweden)

    Becerra, José María

    2003-06-01

    We propose the founding of a Natural History bioinformatics framework, which would solve one of the main problems in Natural History: data which is scattered around in many incompatible systems (not only computer systems, but also paper ones). This framework consists of computer resources (hardware and software), methodologies that ease the circulation of data, and staff expert in dealing with computers, who will develop software solutions to the problems encountered by naturalists. This system is organized in three layers: acquisition, data and analysis. Each layer is described, and an account of the elements that constitute it given.

    We present the foundations of a bioinformatics framework for Natural History, which seeks to solve one of the field's main problems: data distributed across many mutually incompatible systems (not only computer systems, but also paper ones). This framework rests on computing resources (both hardware and software), on methodologies that allow data to circulate easily, and on staff expert in the use of computers who are responsible for developing software solutions to the problems raised by naturalists. The system would be organized in three layers: acquisition, data and analysis. Each of these layers is described, indicating the elements that compose it.

  18. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    Science.gov (United States)

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies. Copyright 2003 S. Karger GmbH, Freiburg

  19. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    Science.gov (United States)

    Karikari, Thomas K.; Quansah, Emmanuel; Mohamed, Wael M.Y.

    2015-01-01

    Bioinformatics and genome science (BGS) are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so. PMID:26767163

  20. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-09-01

    Bioinformatics and genome science (BGS) are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so.

  1. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as future work on parallel computing in bioinformatics. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
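
    As a rough illustration of the MapReduce pattern this survey refers to, the sketch below counts k-mers in sequencing reads using explicit map, shuffle and reduce steps. It runs in a single process and does not use Hadoop; the function names and the k-mer counting task are assumptions chosen for illustration, not an example taken from the article.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_kmers(read: str, k: int) -> List[Tuple[str, int]]:
    """Map step: emit (k-mer, 1) for every window of length k in one read."""
    read = read.upper()
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    """Chain the three steps; on a cluster each step would be distributed."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())
```

    Because each read is mapped independently and each key is reduced independently, both steps can be spread across machines, which is the property that makes the pattern attractive for large sequencing data sets.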

  2. Developing expertise in bioinformatics for biomedical research in Africa.

    Science.gov (United States)

    Karikari, Thomas K; Quansah, Emmanuel; Mohamed, Wael M Y

    2015-09-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are helping to develop bioinformatics into an important discipline in Africa. These contributions are leading to the establishment of world-class research facilities, biorepositories, training programmes, scientific networks and funding schemes to improve studies into disease and health in Africa. With increased contribution from all stakeholders, these developments could be further enhanced. Here, we discuss how the recent developments are contributing to the advancement of bioinformatics in Africa.

  3. Developing expertise in bioinformatics for biomedical research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-09-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are helping to develop bioinformatics into an important discipline in Africa. These contributions are leading to the establishment of world-class research facilities, biorepositories, training programmes, scientific networks and funding schemes to improve studies into disease and health in Africa. With increased contribution from all stakeholders, these developments could be further enhanced. Here, we discuss how the recent developments are contributing to the advancement of bioinformatics in Africa.

  4. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  5. Bioinformatics in the Netherlands: the value of a nationwide community.

    Science.gov (United States)

    van Gelder, Celia W G; Hooft, Rob W W; van Rijswijk, Merlijn N; van den Berg, Linda; Kok, Ruben G; Reinders, Marcel; Mons, Barend; Heringa, Jaap

    2017-09-15

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures supporting a relatively large Dutch bioinformatics community will be reviewed. We will show that the most valuable resource that we have built over these years is the close-knit national expert community that is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing the ever-changing landscape of data challenges and working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times where existing funding models are challenged and change rapidly. © The Author 2017. Published by Oxford University Press.

  6. Bioinformatic training needs at a health sciences campus.

    Science.gov (United States)

    Oliver, Jeffrey C

    2017-01-01

    Health sciences research is increasingly focusing on big data applications, such as genomic technologies and precision medicine, to address key issues in human health. These approaches rely on biological data repositories and bioinformatic analyses, both of which are growing rapidly in size and scope. Libraries play a key role in supporting researchers in navigating these and other information resources. With the goal of supporting bioinformatics research in the health sciences, the University of Arizona Health Sciences Library established a Bioinformation program. To shape the support provided by the library, I developed and administered a needs assessment survey to the University of Arizona Health Sciences campus in Tucson, Arizona. The survey was designed to identify the training topics of interest to health sciences researchers and the preferred modes of training. Survey respondents expressed an interest in a broad array of potential training topics, including "traditional" information seeking as well as interest in analytical training. Of particular interest were training in transcriptomic tools and the use of databases linking genotypes and phenotypes. Staff were most interested in bioinformatics training topics, while faculty were the least interested. Hands-on workshops were significantly preferred over any other mode of training. The University of Arizona Health Sciences Library is meeting those needs through internal programming and external partnerships. The results of the survey demonstrate a keen interest in a variety of bioinformatic resources; the challenge to the library is how to address those training needs. The mode of support depends largely on library staff expertise in the numerous subject-specific databases and tools. Librarian-led bioinformatic training sessions provide opportunities for engagement with researchers at multiple points of the research life cycle. When training needs exceed library capacity, partnering with intramural and

  7. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    Science.gov (United States)

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. The existing classification results are unsatisfactory due to the huge number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727

  8. Information Visualization Techniques in Bioinformatics during the Postgenomic Era

    Science.gov (United States)

    Tao, Ying; Liu, Yang; Friedman, Carol

    2010-01-01

    Information visualization techniques, which take advantage of the bandwidth of human vision, are powerful tools for organizing and analyzing a large amount of data. In the postgenomic era, information visualization tools are indispensable for biomedical research. This paper aims to present an overview of current applications of information visualization techniques in bioinformatics for visualizing different types of biological data, such as from genomics, proteomics, expression profiling and structural studies. Finally, we discuss the challenges of information visualization in bioinformatics related to dealing with more complex biological information in the emerging fields of systems biology and systems medicine. PMID:20976032

  9. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    Science.gov (United States)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  10. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe

    2016-01-01

    This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the decip...

  11. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  12. Probabilistic models and machine learning in structural bioinformatics.

    Science.gov (United States)

    Hamelryck, Thomas

    2009-10-01

    Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes, and the automated analysis and comparison of biomacromolecules in atomic detail. The determination of macromolecular structure from experimental data (for example coming from nuclear magnetic resonance, X-ray crystallography or small angle X-ray scattering) has close ties with the field of structural bioinformatics. Recently, probabilistic models and machine learning methods based on Bayesian principles are providing efficient and rigorous solutions to challenging problems that were long regarded as intractable. In this review, I will highlight some important recent developments in the prediction, analysis and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules and structure determination methods that are based on inference. Although this review is not exhaustive, I believe the selected topics give a good impression of the exciting new, probabilistic road the field of structural bioinformatics is taking.

  13. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes, the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software, the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, with the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
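
    To make the tree enumeration topic concrete, the sketch below uses the standard counting results for fully bifurcating, leaf-labelled trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n taxa. The abstract does not say which formulas the article presents, so this is offered as a generic illustration; the function names are hypothetical.

```python
def double_factorial(m: int) -> int:
    """Product m * (m - 2) * (m - 4) * ... down to 1 or 2; equals 1 for m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted bifurcating topologies for n labelled taxa: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 taxa")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted bifurcating topologies for n labelled taxa: (2n-3)!!"""
    if n_taxa < 2:
        raise ValueError("a rooted bifurcating tree needs at least 2 taxa")
    return double_factorial(2 * n_taxa - 3)
```

    Ten taxa already admit 2,027,025 unrooted topologies, which is why exhaustive tree search becomes impractical quickly and heuristic tree construction methods are needed.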

  14. Science Academies' Refresher Course on Bioinformatics in Modern ...

    Indian Academy of Sciences (India)

    IAS Admin

    A Refresher Course on 'Bioinformatics in Modern Biology' for graduate and postgraduate college/university teachers will be held at School of Life Sciences, Manipal University, Manipal for two weeks from 5 to 17 May 2014. The objective of this Course is to improvise on teaching methodologies incorporating online teaching ...

  15. The structural bioinformatics library: modeling in biomolecular science and beyond.

    Science.gov (United States)

    Cazals, Frédéric; Dreyfus, Tom

    2017-04-01

    Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performance. The SBL involves four software components (1-4 hereafter). For end-users, the SBL provides ready-to-use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, and to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with modular design, involving core (2) algorithms, (3) biophysical models and (4) modules, the latter being especially suited to develop novel applications. The SBL comes with thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Contact: Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  16. A quick guide for building a successful bioinformatics community.

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).

  17. Bioinformatics in the Netherlands : The value of a nationwide community

    NARCIS (Netherlands)

    van Gelder, Celia W.G.; Hooft, Rob; van Rijswijk, Merlijn; van den Berg, Linda; Kok, Ruben; Reinders, M.J.T.; Mons, Barend; Heringa, Jaap

    2017-01-01

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures

  18. Bioinformatic tools and guideline for PCR primer design | Abd ...

    African Journals Online (AJOL)

    Bioinformatics has become an essential tool not only for basic research but also for applied research in biotechnology and biomedical sciences. Optimal primer sequence and appropriate primer concentration are essential for maximal specificity and efficiency of PCR. A poorly designed primer can result in little or no ...

  19. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries. G. Charles Ostermeier (1), David J. Dix (2) and Stephen A. Krawetz (1). (1) Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  20. Robust enzyme design: bioinformatic tools for improved protein stability.

    Science.gov (United States)

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Science Academies' Refresher Course on Bioinformatics in Modern ...

    Indian Academy of Sciences (India)

    Science Academies' Refresher Course on Bioinformatics in Modern Biology. Information and Announcements. Resonance – Journal of Science Education, Volume 19, Issue 2, February 2014, pp 192-192.

  2. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621
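    As a small illustration of the tree-enumeration mathematics mentioned in the abstract above, the sketch below counts bifurcating tree topologies for n labelled taxa using the standard double-factorial formulas: (2n-5)!! for unrooted trees and (2n-3)!! for rooted trees. The function names are illustrative assumptions and are not taken from the article or from the BioQUEST tools.

```python
def double_factorial(k: int) -> int:
    """Return k * (k-2) * (k-4) * ... down to 1 or 2; 1 when k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Unrooted bifurcating topologies for n labelled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted bifurcating tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Rooted bifurcating topologies for n labelled taxa: (2n-3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted bifurcating tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    # Show how quickly the search space grows with the number of taxa.
    for n in (4, 5, 10):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    The rapid growth of these counts is one reason exhaustive tree search becomes impractical for all but small data sets.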

  3. Bioinformatics Analysis of Envelope Glycoprotein E epitopes of ...

    African Journals Online (AJOL)

    User

    2011-05-02

    May 2, 2011 ... A virus-neutralizing antibody by a virus-specific synthetic peptide. J. Virol. 55(3): 836-839. Geourjon C, Deléage G (1995). SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Bioinformatics, 11(6): 681-684. Guex N, Peitsch MC ...

  4. CROSSWORK for Glycans: Glycan Identification Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

    We have developed "GLYCANthrope" - CROSSWORKS for glycans: a bioinformatics tool that assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  5. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics has been one of the fastest-growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of its importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  6. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  7. Bioinformatic analyses of kappa casein gene in mammalian ...

    African Journals Online (AJOL)

    Kappa casein (CSN3) gene is a variant of the milk protein highly conserved in mammalian species. Genetic variations in CSN3 gene of six mammalian livestock species were investigated using bioinformatics approach. A total of twenty-seven CSN3 gene sequences with corresponding amino acids belonging to the six ...

  8. Hidden in the Middle: Culture, Value and Reward in Bioinformatics

    Science.gov (United States)

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics--the so-called shotgun marriage between biology and computer science--is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised…

  9. BIOINFORMATICS SOFTWARE FROM INDIA: CURRENT STATUS AND CHALLENGES

    Directory of Open Access Journals (Sweden)

    Deepti D. Deobagkar

    2017-09-01

    Bioinformatics software and visualisation tools have been a key factor in the rapid and phenomenal advances in genomics, proteomics, medicine, drug discovery, systems approaches and in fact in every area of new development. Indian scientists have also made a mark in a few specific areas. India has the advantage of an early start and an extensive, organised network in bioinformatics education and research, with substantial input from the Indian government. India has a strong hold in computation and IT and has a pool of bright young talent, a demographic dividend, and experienced and excellent mentors and researchers. Although small in number and scale, the bioinformatics industry also has a presence and is making its mark in India. There are a number of high-throughput and extremely useful resources available which are critical in biological data analysis and interpretation. This has brought about a paradigm shift in the way research can be carried out and discoveries can be made in any area of biological, biochemical and chemical research. This article summarises the current status and contributions from India in the development of software and web servers for bioinformatics applications.

  10. Bioinformatics and phylogenetic analysis of human Tp73 gene

    African Journals Online (AJOL)

    Imtiaz

    2013-06-26

    Jun 26, 2013 ... 2Bioinformatics and Biotechnology, DES, FBAS International Islamic University, Islamabad, Pakistan. Accepted 26 April, 2013. The Tp73 ... New discoveries about the control and function of p73 are still in progress and it is ..... modern research for diagnostics and evolutionary history of p73. REFERENCES.

  11. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
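    To make the access-control idea in the abstract above concrete, here is a minimal sketch of the client side of the OAuth 2.0 authorization-code grant as defined in RFC 6749. The abstract does not say which grant type IABio uses, so the choice of grant is an assumption, and the endpoint URL, client identifier, redirect URI and scope name are all hypothetical placeholders. The code only builds the request parameters and performs no network calls.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical values for illustration; none of them come from the IABio paper.
AUTHORIZE_URL = "https://iabio.example.org/oauth/authorize"
CLIENT_ID = "demo-client"
REDIRECT_URI = "https://client.example.org/callback"


def build_authorization_url(scope: str, state: str) -> str:
    """Step 1: URL the user's browser visits to grant the client access."""
    params = {
        "response_type": "code",  # request an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # opaque value echoed back by the server
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def state_matches(callback_url: str, expected_state: str) -> bool:
    """Step 2: reject callbacks whose state differs from the one issued."""
    returned = parse_qs(urlsplit(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(returned.encode(), expected_state.encode())


def build_token_request(code: str) -> dict:
    """Step 3: form fields the client POSTs to the token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


if __name__ == "__main__":
    state = secrets.token_urlsafe(16)  # fresh random value per login attempt
    print(build_authorization_url("read:records", state))
```

    Keeping the three steps as separate functions mirrors the separation of roles in the protocol: the database never sees the user's password for the client application, only a short-lived code that is exchanged for a token.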

  12. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database

    Directory of Open Access Journals (Sweden)

    Jeongseok Choi

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  13. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and Java languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the
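    The kind of multi-database query described in the abstract above (the fraction of EC-numbered enzyme activities with no known sequence) can be sketched as a single SQL statement once both data sets sit in one relational store. The example below uses Python's built-in sqlite3 module rather than MySQL or Oracle, and the two-table schema, table names and rows are invented for illustration; they are assumptions, not the actual BioWarehouse schema or real coverage data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a toy two-table schema (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (accession TEXT PRIMARY KEY, ec_number TEXT);
        INSERT INTO enzyme_activity VALUES
            ('EC-A'), ('EC-B'), ('EC-C'), ('EC-D'), ('EC-E');
        -- Toy rows: EC-A has two sequences, EC-D and EC-E have none.
        INSERT INTO protein_sequence VALUES
            ('SEQ1', 'EC-A'), ('SEQ2', 'EC-A'), ('SEQ3', 'EC-B'), ('SEQ4', 'EC-C');
        """
    )
    return conn


def fraction_without_sequence(conn: sqlite3.Connection) -> float:
    """Fraction of enzyme activities that have no linked sequence."""
    row = conn.execute(
        """
        SELECT AVG(CASE WHEN s.ec_number IS NULL THEN 1.0 ELSE 0.0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    print(fraction_without_sequence(build_demo_db()))
```

    The LEFT JOIN keeps every activity, and the DISTINCT subquery prevents activities with several sequences from being counted more than once.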

  14. Bioclipse: an open source workbench for chemo- and bioinformatics

    Directory of Open Access Journals (Sweden)

    Wagener Johannes

    2007-02-01

    Background: There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no successful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results: Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion: Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under the Eclipse Public License (EPL), an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at http://www.bioclipse.net.

  15. Dynamic partial reconfiguration implementation of the SVM/KNN multi-classifier on FPGA for bioinformatics application.

    Science.gov (United States)

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2015-01-01

    Bioinformatics data tend to be high-dimensional and thus impose significant computational demands. To overcome the limitations of conventional computing methods, several alternative high-performance computing solutions have been proposed, such as Graphical Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The latter have been shown to be efficient and high in performance. In recent years, FPGAs have benefited from the dynamic partial reconfiguration (DPR) feature, which adds the flexibility to alter specific regions within the chip. This work proposes combining FPGAs and DPR to build a dynamic multi-classifier architecture that can be used in processing bioinformatics data. In bioinformatics, applying different classification algorithms to the same dataset is desirable in order to obtain comparable, more reliable, consensus decisions, but it can take a long time on a conventional PC. The DPR implementations of two common classifiers, namely support vector machines (SVMs) and K-nearest neighbor (KNN), are combined to form a multi-classifier FPGA architecture which can use a specific region of the FPGA as either an SVM or a KNN classifier. This multi-classifier DPR implementation achieved at least a ~8x reduction in reconfiguration time over the single non-DPR classifier implementation, and occupied less space and fewer hardware resources than having both classifiers. The proposed architecture can be extended to work as an ensemble classifier.
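    For readers unfamiliar with the classifiers named in the abstract above, the following is a software-only sketch of one of them, K-nearest neighbor, using only the Python standard library. It is not the FPGA or DPR design from the paper and does not cover the SVM; the function name and the toy two-feature data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # The predicted label is the most frequent label among those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-feature samples forming two well-separated groups.
    samples = [
        ((0.0, 0.0), "low"), ((0.0, 1.0), "low"), ((1.0, 0.0), "low"),
        ((5.0, 5.0), "high"), ((5.0, 6.0), "high"), ((6.0, 5.0), "high"),
    ]
    print(knn_predict(samples, (0.5, 0.5)))
    print(knn_predict(samples, (5.5, 5.5)))
```

    Every prediction scans the whole training set, which hints at why distance computations over high-dimensional data are an attractive target for hardware acceleration.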

  16. Bioinformatics training: selecting an appropriate learning content management system--an example from the European Bioinformatics Institute.

    Science.gov (United States)

    Wright, Victoria Ann; Vaughan, Brendan W; Laurent, Thomas; Lopez, Rodrigo; Brooksbank, Cath; Schneider, Maria Victoria

    2010-11-01

    Today's molecular life scientists are well educated in the emerging experimental tools of their trade, but when it comes to training on the myriad of resources and tools for dealing with biological data, a less ideal situation emerges. Often bioinformatics users receive no formal training on how to make the most of the bioinformatics resources and tools available in the public domain. The European Bioinformatics Institute, which is part of the European Molecular Biology Laboratory (EMBL-EBI), holds the world's most comprehensive collection of molecular data, and training the research community to exploit this information is embedded in the EBI's mission. We have evaluated eLearning, in parallel with face-to-face courses, as a means of training users of our data resources and tools. We anticipate that eLearning will become an increasingly important vehicle for delivering training to our growing user base, so we have undertaken an extensive review of Learning Content Management Systems (LCMSs). Here, we describe the process that we used, which considered the requirements of trainees, trainers and systems administrators, as well as taking into account our organizational values and needs. This review describes the literature survey, user discussions and scripted platform testing that we performed to narrow down our choice of platform from 36 to a single platform. We hope that it will serve as guidance for others who are seeking to incorporate eLearning into their bioinformatics training programmes.

  17. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  18. LEARNING HORMONE ACTION MECHANISMS WITH BIOINFORMATICS

    Directory of Open Access Journals (Sweden)

    João Carlos Sousa

    2007-05-01

    The ability to manage the constantly growing information in genetics available on the internet is becoming crucial in biochemical education and medical practice. Therefore, developing students' skills in working with bioinformatics tools is a challenge for undergraduate courses in the molecular life sciences. The regulation of gene transcription by hormones and vitamins is a complex topic that influences all body systems. We describe a student-centered activity used in a multidisciplinary "Functional Organ System" course on the Endocrine System. Receiving, as teams, a nucleotide sequence of a hormone or vitamin-response element, students navigate through internet databases to find the gene to which it belongs. Subsequently, students search for how the corresponding hormone/vitamin influences the expression of that particular gene and how a dysfunctional interaction might cause disease. This activity, proposed for 4 consecutive years to cohorts of 50-60 students/year enrolled in the 2nd year of our undergraduate medical degree, revealed that 90% of the students developed a better understanding of the usefulness of bioinformatics and that 98% intend to use such tools in the future. Since hormones and vitamins regulate genes of all body organ systems, this web-based activity successfully integrates the whole body physiology of the medical curriculum and can be of relevance to other courses on molecular life sciences.

  19. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    International Nuclear Information System (INIS)

    Roche-Lima, Abiel; Thulasiram, Ruppa K

    2012-01-01

    Finite automata in which each transition is augmented with an output label in addition to the familiar input label are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignment of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation, which relies on techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were run using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. In a second experiment, the precision parameter was varied; here, too, the parallel algorithm gave smaller execution times. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was varied. In this last experiment, speedup increased considerably as more threads were used; however, it converges once the number of threads reaches 16 or more.
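    The definition at the start of the abstract above (an automaton whose transitions carry an output label as well as an input label) can be illustrated with a tiny deterministic, unweighted transducer. This is a deliberately simplified sketch: the paper concerns weighted, conditional transducers learned with EM, none of which is shown here. The class name, state names and the CpG-marking task are assumptions chosen for illustration.

```python
class Transducer:
    """Deterministic finite-state transducer.

    `transitions` maps (state, input_symbol) -> (next_state, output_symbol).
    """

    def __init__(self, start, transitions):
        self.start = start
        self.transitions = transitions

    def run(self, symbols):
        """Consume the input symbols and return the concatenated outputs."""
        state = self.start
        outputs = []
        for symbol in symbols:
            state, out = self.transitions[(state, symbol)]
            outputs.append(out)
        return "".join(outputs)


def build_cpg_marker():
    """Transducer that copies a DNA string but writes 'g' for a G right after a C."""
    table = {}
    for state in ("start", "after_C"):
        table[(state, "A")] = ("start", "A")
        table[(state, "T")] = ("start", "T")
        table[(state, "C")] = ("after_C", "C")  # remember that we just saw a C
    table[("start", "G")] = ("start", "G")      # G not preceded by C: copy as-is
    table[("after_C", "G")] = ("start", "g")    # G preceded by C: mark it
    return Transducer("start", table)


if __name__ == "__main__":
    print(build_cpg_marker().run("ACGTCG"))
```

    A weighted transducer would attach a score or probability to each entry of the transition table; learning those weights from paired sequences is the costly step that the paper parallelizes.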

  20. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  1. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  2. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  3. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  4. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  5. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    Science.gov (United States)

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live Linux distributions for bioinformatics have paved the way for portability of the bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery is an advanced customizable configuration of Fedora, with data persistence accessible via USB drive or DVD. Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  6. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  7. Data Mining for Grammatical Inference with Bioinformatics Criteria

    Science.gov (United States)

    López, Vivian F.; Aguilar, Ramiro; Alonso, Luis; Moreno, María N.; Corchado, Juan M.

    In this paper we describe both theoretical and practical results of a novel data mining process that combines hybrid techniques of association analysis and classical sequentiation algorithms of genomics to generate grammatical structures of a specific language. We used an application of a compiler generator system that allows the development of a practical application within the area of grammarware, where the concepts of language analysis are applied to other disciplines, such as bioinformatics. The tool allows the complexity of the obtained grammar to be measured automatically from textual data. A technique of incremental discovery of sequential patterns is presented to obtain simplified production rules, which are compacted with bioinformatics criteria to make up a grammar.

  8. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress (CCBCOL), after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and their integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data, which need to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.

  9. Bioinformatics data distribution and integration via Web Services and XML.

    Science.gov (United States)

    Li, Xiao; Zhang, Yizheng

    2003-11-01

    It is widely recognized that exchange, distribution, and integration of biological data are the keys to improving bioinformatics and genome biology in the post-genomic era. However, the problem of exchanging and integrating biological data has not been solved satisfactorily. The eXtensible Markup Language (XML) is rapidly spreading as an emerging standard for structuring documents to exchange and integrate data on the World Wide Web (WWW). Web services are the next generation of the WWW and are founded upon the open standards of the W3C (World Wide Web Consortium) and the IETF (Internet Engineering Task Force). This paper presents XML and Web Services technologies and their use as an appropriate solution to the problem of bioinformatics data exchange and integration.
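
    As a rough illustration of the XML-based data exchange this abstract refers to, the sketch below serializes a sequence record to XML and parses it back using Python's standard library. The element and attribute names (sequenceRecord, organism, sequence, accession, length) are hypothetical and are not taken from the paper or from any published bioinformatics schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(accession: str, organism: str, sequence: str) -> str:
    """Serialize one sequence record to an XML string (hypothetical layout)."""
    root = ET.Element("sequenceRecord", attrib={"accession": accession})
    ET.SubElement(root, "organism").text = organism
    seq = ET.SubElement(root, "sequence", attrib={"length": str(len(sequence))})
    seq.text = sequence
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> dict:
    """Parse the XML produced by record_to_xml back into a plain dict."""
    root = ET.fromstring(xml_text)
    seq = root.find("sequence")
    return {
        "accession": root.get("accession"),
        "organism": root.findtext("organism"),
        "sequence": seq.text,
        "length": int(seq.get("length")),
    }


if __name__ == "__main__":
    xml_text = record_to_xml("X00001", "Escherichia coli", "ATGC")
    print(xml_text)
    print(xml_to_record(xml_text))
```

    Because both sides agree only on the document structure, the producer and the consumer can be written in different languages, which is the interoperability argument the abstract makes.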

  10. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  11. Personalized medicine: challenges and opportunities for translational bioinformatics.

    Science.gov (United States)

    Overby, Casey Lynnette; Tarczy-Hornoch, Peter

    2013-07-01

    Personalized medicine can be defined broadly as a model of healthcare that is predictive, personalized, preventive and participatory. Two US President's Council of Advisors on Science and Technology reports illustrate challenges in personalized medicine (in a 2008 report) and in use of health information technology (in a 2010 report). Translational bioinformatics is a field that can help address these challenges and is defined by the American Medical Informatics Association as "the development of storage, analytic and interpretive methods to optimize the transformation of increasing voluminous biomedical data into proactive, predictive, preventative and participatory health." This article discusses barriers to implementing genomics applications and current progress toward overcoming barriers, describes lessons learned from early experiences of institutions engaged in personalized medicine and provides example areas for translational bioinformatics research inquiry.

  12. Meeting review: 2002 O'Reilly Bioinformatics Technology Conference.

    Science.gov (United States)

    Counsell, Damian

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O'Reilly Bioinformatics Technology Conference. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences. Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O'Reilly himself, Tim O'Reilly. There were presentations, tutorials, debates, quizzes and even a 'jam session' for musical bioinformaticists.

  13. A quick guide for building a successful bioinformatics community.

    Directory of Open Access Journals (Sweden)

    Aidan Budd

    2015-02-01

    Full Text Available "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).

  14. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Full Text Available Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI), and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD) strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  15. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract. Background: Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results: BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion: BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  16. BIRCH: a user-oriented, locally-customizable, bioinformatics system.

    Science.gov (United States)

    Fristensky, Brian

    2007-02-09

    Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  17. A Quick Guide for Building a Successful Bioinformatics Community

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  18. A P2P Framework for Developing Bioinformatics Applications in Dynamic Cloud Environments.

    Science.gov (United States)

    Lin, Chun-Hung Richard; Wen, Chun-Hao; Lin, Ying-Chih; Tung, Kuang-Yuan; Lin, Rung-Wei; Lin, Chun-Yuan

    2013-01-01

    Bioinformatics has advanced from in-house computing infrastructure to cloud computing in order to tackle the vast quantity of biological data. This advance enables a large number of collaborating researchers to share their work around the world. However, retrieving biological data over the internet becomes more and more difficult because of its explosive growth and frequent changes. Various efforts have been made to address the problems of data discovery and delivery in the cloud framework, but most of them are hindered by their reliance on a MapReduce master server to track all available data. In this paper, we propose an alternative approach, called PRKad, which exploits a Peer-to-Peer (P2P) model to achieve efficient data discovery and delivery. PRKad is a Kademlia-based implementation with Round-Trip-Time (RTT) as the associated key, and it locates data according to a Distributed Hash Table (DHT) and the XOR metric. The simulation results show that PRKad achieves low link latency when retrieving data. As an interdisciplinary application of P2P computing for bioinformatics, PRKad also provides good scalability for servicing a greater number of users in dynamic cloud environments.
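
    The XOR metric mentioned in this abstract is the standard Kademlia notion of distance between identifiers. The following is a minimal sketch of that general idea only; it does not reproduce PRKad, and in particular it does not model the Round-Trip-Time key. The helper names (node_id, xor_distance, closest_nodes, bucket_index) and the 32-bit identifier width are assumptions made for illustration.

```python
import hashlib


def node_id(name: str, bits: int = 32) -> int:
    """Derive a fixed-width identifier from a name (illustrative choice: SHA-1 prefix)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the bitwise XOR of two identifiers, read as an integer."""
    return a ^ b


def closest_nodes(key: int, nodes: list, k: int = 2) -> list:
    """Return the k known node IDs closest to `key` under the XOR metric."""
    return sorted(nodes, key=lambda n: xor_distance(n, key))[:k]


def bucket_index(self_id: int, other_id: int) -> int:
    """Index of the k-bucket for `other_id`: position of the highest differing bit."""
    return (self_id ^ other_id).bit_length() - 1


if __name__ == "__main__":
    peers = [node_id(f"peer-{i}") for i in range(8)]
    key = node_id("some-dataset")
    print(closest_nodes(key, peers, k=3))
```

    A lookup repeatedly asks the closest known peers for peers that are closer still, so no central index of all data is needed, which is the contrast with a master server that the abstract draws.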

  19. A P2P Framework for Developing Bioinformatics Applications in Dynamic Cloud Environments

    Directory of Open Access Journals (Sweden)

    Chun-Hung Richard Lin

    2013-01-01

    Full Text Available Bioinformatics has advanced from in-house computing infrastructure to cloud computing in order to tackle the vast quantity of biological data. This advance enables a large number of collaborating researchers to share their work around the world. However, retrieving biological data over the internet becomes more and more difficult because of its explosive growth and frequent changes. Various efforts have been made to address the problems of data discovery and delivery in the cloud framework, but most of them are hindered by their reliance on a MapReduce master server to track all available data. In this paper, we propose an alternative approach, called PRKad, which exploits a Peer-to-Peer (P2P) model to achieve efficient data discovery and delivery. PRKad is a Kademlia-based implementation with Round-Trip-Time (RTT) as the associated key, and it locates data according to a Distributed Hash Table (DHT) and the XOR metric. The simulation results show that PRKad achieves low link latency when retrieving data. As an interdisciplinary application of P2P computing for bioinformatics, PRKad also provides good scalability for servicing a greater number of users in dynamic cloud environments.

  20. mORCA: sailing bioinformatics world with mobile devices.

    Science.gov (United States)

    Díaz-Del-Pino, Sergio; Falgueras, Juan; Perez-Wohlfeil, Esteban; Trelles, Oswaldo

    2018-03-01

    Nearly 10 years have passed since the first mobile apps appeared. Given the fact that bioinformatics is a web-based world and that mobile devices are endowed with web browsers, it seemed natural that bioinformatics would transit from personal computers to mobile devices, but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Through an in-depth analysis of the requirements of existing bioinformatics applications, we designed and deployed an easy-to-use, web-based, lightweight mobile client. This client is able to browse, select, automatically compose interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App Store by Apple and the Play Store by Google. The software will be available for at least 2 years. Contact: ortrelles@uma.es. Source code, final web-app, training material and documentation are available at http://bitlab-es.com/morca. © The Author(s) 2017. Published by Oxford University Press.

  1. Update in research and methods in proteomics and bioinformatics.

    Science.gov (United States)

    Bencharit, Sompop; Border, Michael B; Edelmann, Alex; Byrd, Warren C

    2013-10-01

    The 3rd International Conference on Proteomics & Bioinformatics (Proteomics 2013), Philadelphia, PA, USA, 15-17 July 2013. The Third International Conference on Proteomics & Bioinformatics (Proteomics 2013) was sponsored by the OMICS group and was organized in order to strengthen the future of proteomics science by bringing together professionals, researchers and scholars from leading universities across the globe. The main topics of this conference included the integration of novel platforms in data analysis, the use of a systems biology approach, different novel mass spectrometry platforms and biomarker discovery methods. The conference was divided into proteomic methods and research interests. Among these two categories, interactions between methods in proteomics and bioinformatics, as well as other research methodologies, were discussed. Exceptional topics from the keynote forum, oral presentations and the poster session have been highlighted. The topics range from new techniques for analyzing proteomics data, to new models designed to help better understand genetic variations, to the differences in the salivary proteomes of HIV-infected patients.

  2. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract: The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  3. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Directory of Open Access Journals (Sweden)

    Teresa K Attwood

    2015-04-01

    Full Text Available In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  4. A comparison of common programming languages used in bioinformatics.

    Science.gov (United States)

    Fourment, Mathieu; Gillings, Michael R

    2008-02-05

    The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
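
    For readers unfamiliar with the Sellers algorithm named in this abstract, the sketch below shows the usual dynamic-programming formulation of approximate pattern matching: an edit-distance table whose first row is all zeros, so that a match may start anywhere in the text. It is an independent illustration, not the benchmark code from the paper, and the function name sellers_search is an assumption.

```python
def sellers_search(pattern: str, text: str, max_dist: int) -> list:
    """Find approximate occurrences of `pattern` in `text`.

    Returns (end_position, distance) pairs, with 1-based end positions, for every
    text position where some substring ending there is within `max_dist` edits
    (substitutions, insertions, deletions) of the pattern.
    """
    m = len(pattern)
    # col[i] holds D[i][j]: best edit distance between pattern[:i] and any
    # substring of text ending at position j. For j = 0, D[i][0] = i.
    col = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # D[0][j-1], always 0
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(prev_diag + cost,  # match or substitution
                      col[i] + 1,        # extra character in the text
                      col[i - 1] + 1)    # pattern character left unmatched
            prev_diag = col[i]
            col[i] = new
        if col[m] <= max_dist:
            hits.append((j, col[m]))
    return hits


if __name__ == "__main__":
    print(sellers_search("ACGT", "TTACGTTT", 0))
    print(sellers_search("ACGT", "TTAGGTTT", 1))
```

    The inner loop is pure integer arithmetic over two nested loops, which is the kind of workload where the abstract reports compiled languages such as C and C++ running fastest.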

  5. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  6. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil, and its gelatinase is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes specific to each of the two meat species. The main focus of this research is to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size and encodes a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1/R1, F2/R2 and F3/R3, were designed to target short sequences of cDNA by PCR. PCR reliably yielded amplicons of 1563 bp for candidate gene P1 and 1701 bp for candidate gene P3. The bioinformatics analysis of L. sphaericus thus identified genes encoding gelatinase.
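
    The gene sizes and amino-acid counts quoted in this abstract are mutually consistent if each reported length is an open reading frame that includes its stop codon, which is an assumption here and is not stated in the abstract. Under that assumption, the protein length is the number of codons minus one. A small check, with the hypothetical helper protein_length:

```python
def protein_length(orf_bp: int) -> int:
    """Amino-acid count for an ORF whose length in bp includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # one codon per residue, minus the stop codon


if __name__ == "__main__":
    # Sizes (bp) and amino-acid counts as reported in the abstract.
    reported = {"P1": (1563, 520), "P2": (1776, 591), "P3": (1701, 566)}
    for name, (bp, aa) in reported.items():
        print(name, bp, protein_length(bp), protein_length(bp) == aa)
```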

  7. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    Science.gov (United States)

    Attwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle D.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  8. Lizenzgeschäfte in der Buchverlagswirtschaft

    OpenAIRE

    Schleifer, Irene

    2016-01-01

    Licensing is an important option for companies to gain an economic advantage compared to other commercial enterprises; it is a significant source of income. Especially in a sector such as the German book industry, which is characterised by its diffuseness in small and medium-sized enterprises, licensing has now become an established method through which companies can assert themselves on the market. They achieve product differentiation and increase their brand’s publicity. A license agreeme...

  9. SimBA: A methodology and tools for evaluating the performance of RNA-Seq bioinformatic pipelines.

    Science.gov (United States)

    Audoux, Jérôme; Salson, Mikaël; Grosset, Christophe F; Beaumeunier, Sacha; Holder, Jean-Marc; Commes, Thérèse; Philippe, Nicolas

    2017-09-29

    The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. In order to provide some answers to these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices. To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that get as close as possible to specific real biological conditions, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains to give an accurate performance evaluation in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved. Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process.
Our SimBA suite of benchmarking tools provides a reliable basis

  10. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    Science.gov (United States)

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most biologists work on textual DNA sequences that are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequence analysis: ADN-Viewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  11. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    Full Text Available Abstract The revolutionary growth in the computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes including a cleaner draft of human genome have been sequenced raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy to administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene

  12. Video Bioinformatics Analysis of Human Embryonic Stem Cell Colony Growth

    Science.gov (United States)

    Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue

    2010-01-01

    Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform data mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion. PMID:20495527
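
    The segment-then-count idea behind the recipes can be sketched in a few lines of Python. This is not CL-Quant code: segmentation is approximated here by a fixed intensity threshold, the enhancement step is omitted, and the frames are tiny hypothetical grey-scale arrays.

```python
def colony_area(frame, threshold):
    """Segment by thresholding and count the pixels assigned to the colony."""
    return sum(1 for row in frame for pixel in row if pixel > threshold)


def growth_curve(frames, threshold=128):
    """Colony area in pixels for each frame of a time-lapse sequence."""
    return [colony_area(frame, threshold) for frame in frames]


# Hypothetical 3x4 grey-scale frames; bright pixels belong to the colony.
frames = [
    [[0, 0, 0, 0], [0, 200, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 200, 210, 0], [0, 190, 0, 0]],
    [[0, 180, 0, 0], [0, 200, 210, 0], [0, 190, 220, 0]],
]

areas = growth_curve(frames)
fold_change = areas[-1] / areas[0]  # growth relative to the first frame
print(areas, fold_change)
```

    Plotting the per-frame areas against acquisition time would give the growth curve; a manual count on the same frames plays the role of the Photoshop verification described in the abstract.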

  13. WU-Blast2 server at the European Bioinformatics Institute

    Science.gov (United States)

    Lopez, Rodrigo; Silventoinen, Ville; Robinson, Stephen; Kibria, Asif; Gish, Warren

    2003-01-01

    Since 1995, the WU-BLAST programs (http://blast.wustl.edu) have provided a fast, flexible and reliable method for similarity searching of biological sequence databases. The software is in use at many locales and web sites. The European Bioinformatics Institute's WU-Blast2 (http://www.ebi.ac.uk/blast2/) server has been providing free access to these search services since 1997 and today supports many features that both enhance the usability and expand on the scope of the software. PMID:12824421

  14. Multilevel Cellular Automata as a Tool for Studying Bioinformatic Processes

    Science.gov (United States)

    Hogeweg, Paulien

    The signature feature of Cellular Automata is the realization that "simple rules can give rise to complex behavior", and in particular that fixed "rock-bottom" simple rules can give rise to multiple levels of organization. Here we describe Multilevel Cellular Automata, in which the microscopic entities (states) and their transition rules themselves are adjusted by the mesoscale patterns that they themselves generate. Thus we study the feedback of higher levels of organization on the lower levels. Such an approach is preeminently important for studying bioinformatic systems. We will here focus on an evolutionary approach to formalize such Multilevel Cellular Automata, and review examples of studies that use them.
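
    For readers unfamiliar with the baseline, the Python sketch below implements a classical one-dimensional, two-state cellular automaton with a fixed rule. It illustrates only the "fixed simple rules" starting point; the multilevel variant described in the abstract, in which patterns feed back on the rules, is not implemented here. The choice of rule 110 and of periodic boundaries is an assumption made for the example.

```python
def step(cells, rule=110):
    """Apply one update of an elementary cellular automaton (periodic boundary)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        # The three-cell neighbourhood selects one bit of the rule number.
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(cells, generations, rule=110):
    """Return the list of successive configurations, starting with the input."""
    history = [cells]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


for row in run([0, 0, 0, 1, 0], generations=2):
    print("".join("#" if cell else "." for cell in row))
```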

  15. [Pharmacogenetics II. Research molecular methods, bioinformatics and ethical concerns].

    Science.gov (United States)

    Daudén, E

    2007-01-01

    Pharmacogenetics refers to the study of the individual pharmacological response based on the genotype. Its objective is to optimize treatment on an individual basis, thereby creating a more efficient and safe personalized therapy. In the second part of this review, the molecular methods of study in pharmacogenetics, including microarray technology or DNA chips, are discussed. Among them we highlight the microarrays used to determine the gene expression that detect specific RNA sequences, and the microarrays employed to determine the genotype that detect specific DNA sequences, including polymorphisms, particularly single nucleotide polymorphisms (SNPs). The relationship between pharmacogenetics, bioinformatics and ethical concerns is reviewed.

  16. Application of bioinformatics on the detection of pathogens by PCR

    International Nuclear Information System (INIS)

    Rezig, Slim; Sakhri, Saber

    2007-01-01

    Salmonellae are the main agents responsible for frequent food-borne gastrointestinal diseases. Their detection by classical methods is laborious, and results take a long time to obtain. In this context, we set out to establish a detection technique for the invA virulence gene, found in the majority of Salmonella species. After PCR amplification using specific primers designed and verified with bioinformatics programs, two primer pairs were established, and they appeared to be very specific and sensitive for the detection of the invA gene. (Author)
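
    The abstract does not say which programs were used to design and verify the primers. As a rough illustration of the kind of check such programs perform, the Python sketch below estimates a melting temperature with the Wallace rule (2 °C per A/T, 4 °C per G/C, a rule of thumb for short oligonucleotides) and predicts the amplicon length on a template. The template and primer sequences are hypothetical and are not invA sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees Celsius (Wallace rule)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def amplicon_length(template, forward, reverse):
    """Length of the predicted product, or None if either primer has no exact site."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears in the template as written.
    site = template.find(reverse_complement(reverse), start + len(forward))
    if site == -1:
        return None
    return site + len(reverse) - start


# Hypothetical template and primer pair, for illustration only.
template = "TTACGGATCCAAGTTTGCACCGTAGCTAGG"
forward = "ACGGATCC"
reverse = "AGCTACGG"

print(wallace_tm(forward), wallace_tm(reverse))
print(amplicon_length(template, forward, reverse))
```

    Real primer design tools also check specificity against non-target genomes, which this exact-match sketch does not attempt.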

  17. Bioinformatics Methods and Tools to Advance Clinical Care. Findings from the Yearbook 2015 Section on Bioinformatics and Translational Informatics.

    Science.gov (United States)

    Soualmia, L F; Lecroq, T

    2015-08-13

    To summarize excellent current research in the field of Bioinformatics and Translational Informatics with application in the health domain and clinical care. We provide a synopsis of the articles selected for the IMIA Yearbook 2015, from which we attempt to derive a synthetic overview of current and future activities in the field. As last year, a first step of selection was performed by querying MEDLINE with a list of MeSH descriptors completed by a list of terms adapted to the section. Each section editor has evaluated separately the set of 1,594 articles and the evaluation results were merged for retaining 15 articles for peer-review. The selection and evaluation process of this Yearbook's section on Bioinformatics and Translational Informatics yielded four excellent articles regarding data management and genome medicine that are mainly tool-based papers. In the first article, the authors present PPISURV a tool for uncovering the role of specific genes in cancer survival outcome. The second article describes the classifier PredictSNP which combines six performing tools for predicting disease-related mutations. In the third article, by presenting a high-coverage map of the human proteome using high resolution mass spectrometry, the authors highlight the need for using mass spectrometry to complement genome annotation. The fourth article is also related to patient survival and decision support. The authors present datamining methods of large-scale datasets of past transplants. The objective is to identify chances of survival. The current research activities still attest the continuous convergence of Bioinformatics and Medical Informatics, with a focus this year on dedicated tools and methods to advance clinical care. Indeed, there is a need for powerful tools for managing and interpreting complex, large-scale genomic and biological datasets, but also a need for user-friendly tools developed for the clinicians in their daily practice. All the recent research and

  18. Protecting innovation in bioinformatics and in-silico biology.

    Science.gov (United States)

    Harrison, Robert

    2003-01-01

    Commercial success or failure of innovation in bioinformatics and in-silico biology requires the appropriate use of legal tools for protecting and exploiting intellectual property. These tools include patents, copyrights, trademarks, design rights, and limiting information in the form of 'trade secrets'. Potentially patentable components of bioinformatics programmes include lines of code, algorithms, data content, data structure and user interfaces. In both the US and the European Union, copyright protection is granted for software as a literary work, and most other major industrial countries have adopted similar rules. Nonetheless, the grant of software patents remains controversial and is being challenged in some countries. Current debate extends to aspects such as whether patents can claim not only the apparatus and methods but also the data signals and/or products, such as a CD-ROM, on which the programme is stored. The patentability of substances discovered using in-silico methods is a separate debate that is unlikely to be resolved in the near future.

  19. A review of bioinformatic methods for forensic DNA analyses.

    Science.gov (United States)

    Liu, Yao-Yuan; Harbison, SallyAnn

    2018-03-01

    Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  1. Agonist Binding to Chemosensory Receptors: A Systematic Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Fabrizio Fierro

    2017-09-01

    Full Text Available Human G-protein coupled receptors (hGPCRs) constitute a large and highly pharmaceutically relevant membrane receptor superfamily. About half of the hGPCRs' family members are chemosensory receptors, involved in bitter taste and olfaction, along with a variety of other physiological processes. Hence these receptors constitute promising targets for pharmaceutical intervention. Molecular modeling has been so far the most important tool to get insights on agonist binding and receptor activation. Here we investigate both aspects by bioinformatics-based predictions across all bitter taste and odorant receptors for which site-directed mutagenesis data are available. First, we observe that state-of-the-art homology modeling combined with previously used docking procedures turned out to reproduce only a limited fraction of ligand/receptor interactions inferred by experiments. This is most probably caused by the low sequence identity with available structural templates, which limits the accuracy of the protein model and in particular of the side-chains' orientations. Methods which transcend the limited sampling of the conformational space of docking may improve the predictions. As an example corroborating this, we review here multi-scale simulations from our lab and show that, for the three complexes studied so far, they significantly enhance the predictive power of the computational approach. Second, our bioinformatics analysis provides support to previous claims that several residues, including those at positions 1.50, 2.50, and 7.52, are involved in receptor activation.

  2. Bioinformatic prediction and functional characterization of human KIAA0100 gene

    Directory of Open Access Journals (Sweden)

    He Cui

    2017-02-01

    Full Text Available Our previous study demonstrated that human KIAA0100 gene was a novel acute monocytic leukemia-associated antigen (MLAA) gene. But the functional characterization of human KIAA0100 gene has remained unknown to date. Here, firstly, bioinformatic prediction of human KIAA0100 gene was carried out using online software; secondly, human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system in U937 cells. Cell proliferation and apoptosis were next evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that human KIAA0100 gene was located on 17q11.2, and human KIAA0100 protein was located in the secretory pathway. Besides, human KIAA0100 protein contained a signal peptide, a transmembrane region, three types of secondary structures (alpha helix, extended strand, and random coil), and four domains from mitochondrial protein 27 (FMP27). The observation on functional characterization of human KIAA0100 gene revealed that its downregulation inhibited cell proliferation, and promoted cell apoptosis in U937 cells. To summarize, these results suggest human KIAA0100 gene possibly comes within mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.

  3. MAPI: towards the integrated exploitation of bioinformatics Web Services

    Directory of Open Access Journals (Sweden)

    Karlsson Johan

    2011-10-01

    Full Text Available Abstract Background Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. Results To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. Conclusions The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).

  4. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    Science.gov (United States)

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  5. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Science.gov (United States)

    Wagener, Johannes; Spjuth, Ola; Willighagen, Egon L; Wikberg, Jarl ES

    2009-01-01

    Background Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allow for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics. PMID:19732427
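
    The difference between polling for status and receiving the result on completion can be shown with a small Python sketch. It uses the standard asyncio module to imitate the two client styles inside one process; it does not speak XMPP or HTTP, and the job, delays, and function names are all hypothetical.

```python
import asyncio


async def run_job(delay):
    """Stand-in for a long-running remote analysis (hypothetical)."""
    await asyncio.sleep(delay)
    return "result"


async def polling_client(job, interval):
    """Polling style: ask repeatedly whether the job has finished."""
    polls = 0
    while not job.done():
        polls += 1
        await asyncio.sleep(interval)
    return job.result(), polls


async def push_client(job):
    """Asynchronous style: suspend until the result is delivered, no status checks."""
    result = await job
    return result, 0


async def main():
    polled = await polling_client(asyncio.create_task(run_job(0.05)), 0.01)
    pushed = await push_client(asyncio.create_task(run_job(0.05)))
    return polled, pushed


polled, pushed = asyncio.run(main())
print("polling client:", polled)
print("push client:", pushed)
```

    The polling client issues at least one status check, and more as the job runs longer, whereas the push-style client issues none; over a network each check would be a separate request.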

  6. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Directory of Open Access Journals (Sweden)

    Willighagen Egon L

    2009-09-01

    Full Text Available Abstract Background Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allow for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  7. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services.

    Science.gov (United States)

    Wagener, Johannes; Spjuth, Ola; Willighagen, Egon L; Wikberg, Jarl E S

    2009-09-04

    Life sciences make heavy use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allow for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  8. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Xiaohui Lin

    2017-12-01

    Full Text Available Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
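
    The recursive deletion sequence that plain SVM-RFE produces can be sketched in dependency-free Python. This covers only the baseline ranking step: the overlapping-ratio criterion of SVM-RFE-OA and the sample screening of M-SVM-RFE-OA are not implemented. The linear SVM is trained here by batch subgradient descent on the regularized hinge loss, a simple stand-in for a library solver, and the data set and hyperparameters are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a bias-free linear SVM by batch subgradient descent on the hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # sample lies inside the margin or is misclassified
                for j in range(d):
                    grad[j] += yi * xi[j]
        # Step against the subgradient of (lam/2)*|w|^2 + mean hinge loss.
        w = [wj - lr * (lam * wj - gj / n) for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(sub, y)
        # Drop the feature whose squared weight is smallest, then retrain.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


# Hypothetical toy data: feature 0 separates the classes strongly,
# feature 1 weakly, and feature 2 is unrelated to the class label.
X = [[2.0, 0.5, 1.0], [2.0, 0.5, -1.0], [-2.0, -0.5, 1.0], [-2.0, -0.5, -1.0]]
y = [1, 1, -1, -1]

print(svm_rfe(X, y))
```

    The abstract's contribution starts where this sketch stops: deciding how many of the top-ranked features to keep.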

  9. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  10. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  11. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN

    Science.gov (United States)

    2010-01-01

    Background: Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. Results: VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Conclusions: Bioinformatics curation and ontological

  12. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative. Copyright © 2012 Wiley Periodicals, Inc.

  13. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    Science.gov (United States)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bioinformatics content of each state's biology standards was categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, and the likelihood of engaging in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  14. The web server of IBM's Bioinformatics and Pattern Discovery group.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  15. Getting started with microbiome analysis: sample acquisition to bioinformatics.

    Science.gov (United States)

    Kumar, Ranjit; Eipers, Peter; Little, Rebecca B; Crowley, Michael; Crossman, David K; Lefkowitz, Elliot J; Morrow, Casey D

    2014-07-14

    Historically, in order to study microbes, it was necessary to grow them in the laboratory. It was clear though that many microbe communities were refractory to study because none of the members could be grown outside of their native habitat. The development of culture-independent methods to study microbiota using high-throughput sequencing of the 16S ribosomal RNA gene variable regions present in all prokaryotic organisms has provided new opportunities to investigate complex microbial communities. In this unit, the process for a microbiome analysis is described. Many of the components required for this process may already exist. A pipeline is described for acquisition of samples from different sites on the human body, isolation of microbial DNA, and DNA sequencing using the Illumina MiSeq sequencing platform. Finally, a new analytical workflow for basic bioinformatics data analysis, QWRAP, is described, which can be used by clinical and basic science investigators. Copyright © 2014 John Wiley & Sons, Inc.

  16. BioRuby: bioinformatics software for the Ruby programming language.

    Science.gov (United States)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-10-15

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. With JRuby, BioRuby also runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. Contact: katayama@bioruby.org

  17. Bioinformatics and the Politics of Innovation in the Life Sciences

    Science.gov (United States)

    Zhou, Yinhua; Datta, Saheli; Salter, Charlotte

    2016-01-01

    The governments of China, India, and the United Kingdom are unanimous in their belief that bioinformatics should supply the link between basic life sciences research and its translation into health benefits for the population and the economy. Yet at the same time, as ambitious states vying for position in the future global bioeconomy, they differ considerably in the strategies adopted in pursuit of this goal. At the heart of these differences lies the interaction between epistemic change within the scientific community itself and the apparatus of the state. Drawing on desk-based research and thirty-two interviews with scientists and policy makers in the three countries, this article analyzes the politics that shape this interaction. From this analysis emerges an understanding of the variable capacities of different kinds of states and political systems to work with science in harnessing the potential of new epistemic territories in global life sciences innovation. PMID:27546935

  18. A bioinformatics roadmap for the human vaccines project.

    Science.gov (United States)

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  19. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data, which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with a query length of 511 amino acids). © 2010 IEEE.
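
    For readers unfamiliar with the terms in this abstract, the following is a minimal single-threaded sketch of Smith-Waterman local alignment scoring and of the GCUPS (giga cell updates per second) metric. It is not the authors' hybrid GPU/CPU implementation. The function names, the linear gap penalty, and the match/mismatch/gap values are illustrative assumptions; protein searches normally use a substitution matrix instead.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (linear gap penalty, assumed scoring values)."""
    prev = [0] * (len(target) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_length, database_residues, seconds):
    """Giga cell updates per second: matrix cells filled divided by time, in 1e9 units."""
    return query_length * database_residues / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
    print(gcups(511, 1e9, 511.0))
```

    Each matrix cell depends only on its left, upper, and upper-left neighbours, which is why the work can be split across processing units; how the paper divides it between GPU and CPUs is not specified in the abstract.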

  20. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs) are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA) has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV) held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  1. Achievements and challenges in structural bioinformatics and computational biophysics.

    Science.gov (United States)

    Samish, Ilan; Bourne, Philip E; Najmanovich, Rafael J

    2015-01-01

    The field of structural bioinformatics and computational biophysics has undergone a revolution in the last 10 years, with developments captured annually through the 3DSIG meeting, upon which this article reflects. An increase in the accessible data, computational resources and methodology has resulted in an increase in the size and resolution of studied systems and the complexity of the questions amenable to research. Concomitantly, the parameterization and efficiency of the methods have markedly improved along with their cross-validation with other computational and experimental results. The field exhibits an ever-increasing integration with biochemistry, biophysics and other disciplines. In this article, we discuss recent achievements along with current challenges within the field. © The Author 2014. Published by Oxford University Press.

  2. Bioinformatics tools in predictive ecology: applications to fisheries

    Science.gov (United States)

    Tucker, Allan; Duplisea, Daniel

    2012-01-01

    There has been a huge effort in the advancement of analytical techniques for molecular biological data over the past decade. This has led to many novel algorithms that are specialized to deal with data associated with biological phenomena, such as gene expression and protein interactions. In contrast, ecological data analysis has remained focused to some degree on off-the-shelf statistical techniques though this is starting to change with the adoption of state-of-the-art methods, where few assumptions can be made about the data and a more explorative approach is required, for example, through the use of Bayesian networks. In this paper, some novel bioinformatics tools for microarray data are discussed along with their ‘crossover potential’ with an application to fisheries data. In particular, a focus is made on the development of models that identify functionally equivalent species in different fish communities with the aim of predicting functional collapse. PMID:22144390

  3. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. © The Author 2015. Published by Oxford University Press on behalf of the European

  4. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  5. Comparative Proteome Bioinformatics: Identification of Phosphotyrosine Signaling Proteins in the Unicellular Protozoan Ciliate Tetrahymena

    DEFF Research Database (Denmark)

    Gammeltoft, Steen; Christensen, Søren Tvorup; Joachimiak, Marcin

    2005-01-01

    Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH...

  6. Bioinformatics Methods for Interpreting Toxicogenomics Data: The Role of Text-Mining

    NARCIS (Netherlands)

    Hettne, K.M.; Kleinjans, J.; Stierum, R.H.; Boorsma, A.; Kors, J.A.

    2014-01-01

    This chapter concerns the application of bioinformatics methods to the analysis of toxicogenomics data. The chapter starts with an introduction covering how bioinformatics has been applied in toxicogenomics data analysis, and continues with a description of the foundations of a specific

  7. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  8. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  9. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  10. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floraino, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  11. Influenza research database: an integrated bioinformatics resource for influenza virus research

    Science.gov (United States)

    The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...

  12. Implementing a Web-Based Introductory Bioinformatics Course for Non-Bioinformaticians That Incorporates Practical Exercises

    Science.gov (United States)

    Vincent, Antony T.; Bourbonnais, Yves; Brouard, Jean-Simon; Deveau, Hélène; Droit, Arnaud; Gagné, Stéphane M.; Guertin, Michel; Lemieux, Claude; Rathier, Louis; Charette, Steve J.; Lagüe, Patrick

    2018-01-01

    A recent scientific discipline, bioinformatics, defined as using informatics for the study of biological problems, is now a requirement for the study of biological sciences. Bioinformatics has become such a powerful and popular discipline that several academic institutions have created programs in this field, allowing students to become…

  13. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  14. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  15. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  16. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  17. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  18. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  19. Green Fluorescent Protein-Focused Bioinformatics Laboratory Experiment Suitable for Undergraduates in Biochemistry Courses

    Science.gov (United States)

    Rowe, Laura

    2017-01-01

    An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…

  20. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Science.gov (United States)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  1. Bioinformatics goes to school--new avenues for teaching contemporary biology.

    Science.gov (United States)

    Wood, Louisa; Gebhardt, Philipp

    2013-01-01

    Since 2010, the European Molecular Biology Laboratory's (EMBL) Heidelberg laboratory and the European Bioinformatics Institute (EMBL-EBI) have jointly run bioinformatics training courses developed specifically for secondary school science teachers within Europe and EMBL member states. These courses focus on introducing bioinformatics, databases, and data-intensive biology, allowing participants to explore resources and providing classroom-ready materials to support them in sharing this new knowledge with their students. In this article, we chart our progress made in creating and running three bioinformatics training courses, including how the course resources are received by participants and how these, and bioinformatics in general, are subsequently used in the classroom. We assess the strengths and challenges of our approach, and share what we have learned through our interactions with European science teachers.

  2. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings volume presents recent practical applications of Computational Biology and Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, on June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  3. Extending Asia Pacific bioinformatics into new realms in the "-omics" era.

    Science.gov (United States)

    Ranganathan, Shoba; Eisenhaber, Frank; Tong, Joo Chuan; Tan, Tin Wee

    2009-12-03

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation dating back to 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 7-11, 2009 at Biopolis, Singapore. Besides bringing together scientists from the field of bioinformatics in this region, InCoB has actively engaged clinicians and researchers from the area of systems biology, to facilitate greater synergy between these two groups. InCoB2009 followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India), Hong Kong and Taipei (Taiwan), with InCoB2010 scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. The Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and symposia on Clinical Bioinformatics (CBAS), the Singapore Symposium on Computational Biology (SYMBIO) and training tutorials were scheduled prior to the scientific meeting, and provided ample opportunity for in-depth learning and special interest meetings for educators, clinicians and students. We provide a brief overview of the peer-reviewed bioinformatics manuscripts accepted for publication in this supplement, grouped into thematic areas. In order to facilitate scientific reproducibility and accountability, we have, for the first time, introduced minimum information criteria for our publications, including compliance with a Minimum Information about a Bioinformatics Investigation (MIABi). As the regional research expertise in bioinformatics matures, we have delineated a minimum set of bioinformatics skills required for addressing the computational challenges of the "-omics" era.

  4. Sample Subset Optimization Techniques for Imbalanced and Ensemble Learning Problems in Bioinformatics Applications.

    Science.gov (United States)

    Yang, Pengyi; Yoo, Paul D; Fernando, Juanita; Zhou, Bing B; Zhang, Zili; Zomaya, Albert Y

    2014-03-01

    Data sampling is a widely used technique in a broad range of machine learning problems. Traditional sampling approaches generally rely on random resampling from a given dataset. However, these approaches do not take into consideration additional information, such as sample quality and usefulness. We recently proposed a data sampling technique, called sample subset optimization (SSO). The SSO technique relies on a cross-validation procedure for identifying and selecting the most useful samples as subsets. In this paper, we describe the application of SSO techniques to imbalanced and ensemble learning problems, respectively. For imbalanced learning, the SSO technique is employed as an under-sampling technique for identifying a subset of highly discriminative samples in the majority class. In ensemble learning, the SSO technique is utilized as a generic ensemble technique where multiple optimized subsets of samples from each class are selected for building an ensemble classifier. We demonstrate the utilities and advantages of the proposed techniques on a variety of bioinformatics applications where class imbalance, small sample size, and noisy data are prevalent.
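
    For orientation, the random resampling that the abstract contrasts with SSO can be sketched as follows. This is only the conventional baseline (random under-sampling of the larger classes), not the SSO procedure itself; the function name and the toy data are illustrative assumptions.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has.

    Baseline only: it ignores sample quality, which is the limitation
    the abstract attributes to traditional sampling approaches.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # Draw n_min samples without replacement from each class.
        for sample in rng.sample(group, n_min):
            balanced.append((sample, label))
    return balanced


# Hypothetical toy data: 8 majority-class samples, 2 minority-class samples.
toy_samples = list(range(10))
toy_labels = [0] * 8 + [1] * 2
balanced = random_undersample(toy_samples, toy_labels)
print(len(balanced))  # 4 samples remain, 2 per class
```

    Which majority samples survive here is purely a matter of chance, which is the gap a quality-aware selection method is meant to close.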

  5. Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods.

    Science.gov (United States)

    Martin, Guillaume; Baurens, Franc-Christophe; Droc, Gaëtan; Rouard, Mathieu; Cenci, Alberto; Kilian, Andrzej; Hastie, Alex; Doležel, Jaroslav; Aury, Jean-Marc; Alberti, Adriana; Carreel, Françoise; D'Hont, Angélique

    2016-03-16

    Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata). We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%. The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in
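
    The N50 figures quoted above follow the usual assembly convention: sort scaffolds from longest to shortest and report the length at which the running total first reaches half of the assembly size. The scaffold counts given in parentheses in the abstract correspond to the number of scaffolds needed to reach that point. A minimal sketch, with a made-up list of lengths rather than the Musa data:

```python
def n50(lengths):
    """Return (N50, number of scaffolds needed to reach half the total length)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Hypothetical scaffold lengths (arbitrary units), total = 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # (70, 2): the two longest scaffolds cover half the total
```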

  6. A web services choreography scenario for interoperating bioinformatics applications.

    Science.gov (United States)

    de Knikker, Remko; Guo, Youjun; Li, Jin-Long; Kwan, Albert K H; Yip, Kevin Y; Cheung, David W; Cheung, Kei-Hoi

    2004-03-10

    Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web services using a web

  7. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    Full Text Available Abstract Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates

  8. A web services choreography scenario for interoperating bioinformatics applications

    Science.gov (United States)

    de Knikker, Remko; Guo, Youjun; Li, Jin-long; Kwan, Albert KH; Yip, Kevin Y; Cheung, David W; Cheung, Kei-Hoi

    2004-01-01

    Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web

  9. A review of bioinformatics training applied to research in molecular medicine, agriculture and biodiversity in Costa Rica and Central America.

    Science.gov (United States)

    Orozco, Allan; Morera, Jessica; Jiménez, Sergio; Boza, Ricardo

    2013-09-01

    Today, Bioinformatics has become a scientific discipline with great relevance for the Molecular Biosciences and for the Omics sciences in general. Although developed countries have progressed with large strides in Bioinformatics education and research, in other regions, such as Central America, the advances have occurred gradually and with little support from academia, either at the undergraduate or graduate level. To address this problem, the University of Costa Rica's Medical School, a regional leader in Bioinformatics in Central America, has been conducting a series of Bioinformatics workshops, seminars and courses, leading to the creation of the region's first Bioinformatics Master's Degree. The recent creation of the Central American Bioinformatics Network (BioCANET), associated with the deployment of a supporting computational infrastructure (HPC Cluster) devoted to providing computing support for Molecular Biology in the region, is laying a foundation stone for the development of Bioinformatics in the area. Central American bioinformaticians have participated in the creation of, and co-founded, the Iberoamerican Bioinformatics Society (SOIBIO). In this article, we review the most recent activities in education and research in Bioinformatics from several regional institutions. These activities have resulted in further advances for Molecular Medicine, Agriculture and Biodiversity research in Costa Rica and the rest of the Central American countries. Finally, we provide summary information on the first Central America Bioinformatics International Congress, as well as the creation of the first Bioinformatics company (Indromics Bioinformatics), an academic spin-off in Central America and the Caribbean.

  10. Bioinformatics programs are 31-fold over-represented among the highest impact scientific papers of the past two decades.

    Science.gov (United States)

    Wren, Jonathan D

    2016-09-01

    To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success.
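
    The Gini coefficient mentioned above measures how unevenly a quantity is distributed: 0 when every item has the same value, approaching 1 when a single item holds nearly everything. The abstract does not state exactly which distribution was used, so the sketch below simply assumes a list of per-paper citation counts; the numbers are invented for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical citation counts: an even spread versus a 'hit or miss' pattern.
even = [10, 10, 10, 10]
hit_or_miss = [0, 0, 0, 40]
print(gini(even))         # 0.0
print(gini(hit_or_miss))  # 0.75
```

    A field in which a few programs collect most citations and the rest very few would show a high value, as in the second toy list.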

  11. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  12. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Directory of Open Access Journals (Sweden)

    Michael Römer

    Full Text Available Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  13. Progress and challenges in bioinformatics approaches for enhancer identification

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2017-02-03

    Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration.

  14. WeBIAS: a web server for publishing bioinformatics applications.

    Science.gov (United States)

    Daniluk, Paweł; Wilczyński, Bartek; Lesyng, Bogdan

    2015-11-02

    One of the requirements for a successful scientific tool is its availability. Developing a functional web service, however, is usually considered a mundane and ungratifying task, and quite often neglected. When publishing bioinformatic applications, such an attitude puts an additional burden on the reviewers, who have to cope with poorly designed interfaces in order to assess the quality of presented methods, and it also impairs actual usefulness to the scientific community at large. In this note we present WeBIAS, a simple, self-contained solution to make command-line programs accessible through web forms. It comprises a web portal capable of serving several applications and backend schedulers which carry out computations. The server handles user registration and authentication, stores queries and results, and provides a convenient administrator interface. WeBIAS is implemented in Python and available under GNU Affero General Public License. It has been developed and tested on GNU/Linux compatible platforms covering a vast majority of operational WWW servers. Since it is written in pure Python, it should be easy to deploy also on all other platforms supporting Python (e.g. Windows, Mac OS X). Documentation and source code, as well as a demonstration site, are available at http://bioinfo.imdik.pan.pl/webias. WeBIAS has been designed specifically with ease of installation and deployment of services in mind. Setting up a simple application requires minimal effort, yet it is possible to create visually appealing, feature-rich interfaces for query submission and presentation of results.

  15. Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.

    Science.gov (United States)

    Graña, Osvaldo; López-Fernández, Hugo; Fdez-Riverola, Florentino; González Pisano, David; Glez-Peña, Daniel

    2018-04-15

    High-throughput sequencing of bisulfite-converted DNA is a technique used to measure DNA methylation levels. Although a considerable number of computational pipelines have been developed to analyze such data, none of them tackles all the peculiarities of the analysis together, revealing limitations that can force the user to manually perform additional steps needed for a complete processing of the data. This article presents bicycle, an integrated, flexible analysis pipeline for bisulfite sequencing data. Bicycle analyzes whole genome bisulfite sequencing data, targeted bisulfite sequencing data and hydroxymethylation data. To show how bicycle overtakes other available pipelines, we compared them on a defined number of features that are summarized in a table. We also tested bicycle with both simulated and real datasets, to show its level of performance, and compared it to different state-of-the-art methylation analysis pipelines. Bicycle is publicly available under GNU LGPL v3.0 license at http://www.sing-group.org/bicycle. Users can also download a customized Ubuntu LiveCD including bicycle and other bisulfite sequencing data pipelines compared here. In addition, a docker image with bicycle and its dependencies, which allows a straightforward use of bicycle in any platform (e.g. Linux, OS X or Windows), is also available.
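
    As background to the measurement described above: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C, so the methylation level at a site is commonly estimated as the fraction of reads showing C. The sketch below illustrates that arithmetic on the forward strand only; it is not bicycle's code, and the function names and example values are assumptions made for illustration.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """In-silico conversion: unmethylated C becomes T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine site."""
    total = c_reads + t_reads
    if total == 0:
        return None  # no coverage, level undefined
    return c_reads / total


# Hypothetical fragment with a methylated cytosine at index 1.
print(bisulfite_convert("ACGTCC", methylated_positions={1}))  # ACGTTT
# Hypothetical site covered by 15 reads with C and 5 reads with T.
print(methylation_level(15, 5))  # 0.75
```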

  16. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing‐data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  17. Accurate Prediction of Coronary Artery Disease Using Bioinformatics Algorithms

    Directory of Open Access Journals (Sweden)

    Hajar Shafiee

    2016-06-01

    Full Text Available Background and Objectives: Cardiovascular disease is one of the main causes of death in developed and Third World countries. According to the statement of the World Health Organization, it is predicted that death due to heart disease will rise to 23 million by 2030. According to the latest statistics reported by Iran’s Minister of health, 3.39% of all deaths are attributed to cardiovascular diseases and 19.5% are related to myocardial infarction. The aim of this study was to predict coronary artery disease using data mining algorithms. Methods: In this study, various bioinformatics algorithms, such as decision trees, neural networks, support vector machines, clustering, etc., were used to predict coronary heart disease. The data used in this study was taken from several valid databases (including 14 data). Results: In this research, data mining techniques can be effectively used to diagnose different diseases, including coronary artery disease. Also, for the first time, a prediction system based on support vector machine with the best possible accuracy was introduced. Conclusion: The results showed that among the features, thallium scan variable is the most important feature in the diagnosis of heart disease. Designation of machine prediction models, such as support vector machine learning algorithm can differentiate between sick and healthy individuals with 100% accuracy.

  18. The European Bioinformatics Institute in 2017: data coordination and integration

    Science.gov (United States)

    Cochrane, Guy; Apweiler, Rolf; Birney, Ewan

    2018-01-01

    The European Bioinformatics Institute (EMBL-EBI) supports life-science research throughout the world by providing open data, open-source software and analytical tools, and technical infrastructure (https://www.ebi.ac.uk). We accommodate an increasingly diverse range of data types and integrate them, so that biologists in all disciplines can explore life in ever-increasing detail. We maintain over 40 data resources, many of which are run collaboratively with partners in 16 countries (https://www.ebi.ac.uk/services). Submissions continue to increase exponentially: our data storage has doubled in less than two years to 120 petabytes. Recent advances in cellular imaging and single-cell sequencing techniques are generating a vast amount of high-dimensional data, bringing to light new cell types and new perspectives on anatomy. Accordingly, one of our main focus areas is integrating high-quality information from bioimaging, biobanking and other types of molecular data. This is reflected in our deep involvement in Open Targets, stewarding of plant phenotyping standards (MIAPPE) and partnership in the Human Cell Atlas data coordination platform, as well as the 2017 launch of the Omics Discovery Index. This update gives a bird's-eye view of EMBL-EBI’s approach to data integration and service development as genomics begins to enter the clinic. PMID:29186510

  19. Phylogenetic diversity (PD) and biodiversity conservation: some bioinformatics challenges

    Directory of Open Access Journals (Sweden)

    Daniel P. Faith

    2006-01-01

    Full Text Available Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a). While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bio-informatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data.
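
    The PD definition quoted above (the minimum total branch length needed to span a set of taxa) can be illustrated on a toy tree. The sketch below assumes a rooted tree stored as a child-to-(parent, branch length) mapping and counts only the branches connecting the chosen taxa to one another; many PD applications additionally count the path down to the root, which this version deliberately leaves out. The tree, names and lengths are hypothetical.

```python
def path_to_root(tree, node):
    """Branches from node up to the root, each identified by its child node."""
    edges = []
    while node in tree:
        edges.append(node)
        node = tree[node][0]  # move to the parent
    return edges


def phylogenetic_diversity(tree, taxa):
    """Total length of the branches needed to connect the given taxa."""
    paths = [set(path_to_root(tree, taxon)) for taxon in taxa]
    if not paths:
        return 0.0
    union = set.union(*paths)
    # Branches shared by every path lie above the common ancestor of the set.
    above_common_ancestor = set.intersection(*paths)
    return sum(tree[edge][1] for edge in union - above_common_ancestor)


# Hypothetical tree: R is the root, X is the ancestor of A and B.
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 1.5),
    "X": ("R", 2.0),
    "C": ("R", 4.0),
}
print(phylogenetic_diversity(toy_tree, ["A", "B"]))       # 2.5
print(phylogenetic_diversity(toy_tree, ["A", "C"]))       # 7.0
print(phylogenetic_diversity(toy_tree, ["A", "B", "C"]))  # 8.5
```

    In the last call the deeper branch leading to X is counted once even though both A and B depend on it, which is the shared-deep-branch behaviour the abstract emphasizes.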

  20. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment) and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  1. E-MSD: an integrated data resource for bioinformatics.

    Science.gov (United States)

    Velankar, S; McNeil, P; Mittard-Runte, V; Suarez, A; Barrell, D; Apweiler, R; Henrick, K

    2005-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.

  2. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP) encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  3. Immunogenicity of novel Dengue virus epitopes identified by bioinformatic analysis.

    Science.gov (United States)

    Sánchez-Burgos, Gilma; Ramos-Castañeda, José; Cedillo-Rivera, Roberto; Dumonteil, Eric

    2010-10-01

    We used T cell epitope prediction tools to identify epitopes from Dengue virus polyprotein sequences, and evaluated in vivo and in vitro the immunogenicity and antigenicity of the corresponding synthetic vaccine candidates. Twenty-two epitopes were predicted to have a high affinity for MHC class I (H-2Kd, H-2Dd, H-2Ld alleles) or class II (IAd alleles). These epitopes were conserved between the four virus serotypes, but with no similarity to human and mouse sequences. Thirteen synthetic peptides induced specific antibodies production with or without T cells activation in mice. Three synthetic peptides induced mostly IgG antibodies, and one of these from the E gene induced a neutralizing response. Ten peptides induced a combination of humoral and cellular responses by CD4+ and CD8+ T cells. Twelve peptides were novel B and T cell epitopes. These results indicate that our bioinformatics strategy is a powerful tool for the identification of novel antigens and its application to human HLA may lead to a potent epitope-based vaccine against Dengue virus and many other pathogens. (c) 2010 Elsevier B.V. All rights reserved.

  4. MEMOSys: Bioinformatics platform for genome-scale metabolic models.

    Science.gov (United States)

    Pabinger, Stephan; Rader, Robert; Agren, Rasmus; Nielsen, Jens; Trajanoski, Zlatko

    2011-01-31

    Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  5. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Agren Rasmus

    2011-01-01

    Full Text Available Abstract Background Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  6. Exploiting graphics processing units for computational biology and bioinformatics.

    Science.gov (United States)

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of general-purpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700.
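
    The all-pairs distance calculation mentioned in this abstract can be sketched as a plain serial baseline. The snippet below is only an illustration in Python, not the authors' CPU or CUDA code: the Euclidean metric and the function name `all_pairs_distance` are assumptions, since the abstract does not say which distance was used. It shows why the problem suits a GPU: every (i, j) entry is computed independently of the others.

```python
import math
from typing import List, Sequence


def all_pairs_distance(points: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the symmetric matrix of Euclidean distances between all rows.

    Hypothetical serial baseline; the metric is an assumption.
    """
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric.
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            dist[i][j] = d
            dist[j][i] = d
    return dist


if __name__ == "__main__":
    for row in all_pairs_distance([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]):
        print(row)
```

    In a GPU version, each iteration of the inner loop would typically be handled by its own thread; the serial loop here does O(n²) distance evaluations one after another.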

  7. Graphics processing units in bioinformatics, computational biology and systems biology.

    Science.gov (United States)

    Nobile, Marco S; Cazzaniga, Paolo; Tangherloni, Andrea; Besozzi, Daniela

    2017-09-01

    Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining an increasing attention by the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools. © The Author 2016. Published by Oxford University Press.

  8. Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support.

    Science.gov (United States)

    Stockinger, Heinz; Altenhoff, Adrian M; Arnold, Konstantin; Bairoch, Amos; Bastian, Frederic; Bergmann, Sven; Bougueleret, Lydie; Bucher, Philipp; Delorenzi, Mauro; Lane, Lydie; Le Mercier, Philippe; Lisacek, Frédérique; Michielin, Olivier; Palagi, Patricia M; Rougemont, Jacques; Schwede, Torsten; von Mering, Christian; van Nimwegen, Erik; Walther, Daniel; Xenarios, Ioannis; Zavolan, Mihaela; Zdobnov, Evgeny M; Zoete, Vincent; Appel, Ron D

    2014-07-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. 
Conclusion MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  10. "Broadband" Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP): Educational Model for Upliftment and Sustainable Development.

    Science.gov (United States)

    Chimusa, Emile R; Mbiyavanga, Mamana; Masilela, Velaphi; Kumuthini, Judit

    2015-11-01

    A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.

  11. Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Ju-Suk Nam

    2014-03-01

    Full Text Available Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers.

  12. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Directory of Open Access Journals (Sweden)

    Jinkyu Kim

    Full Text Available The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.

  13. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Science.gov (United States)

    Kim, Jinkyu; Kim, Gunn; An, Sungbae; Kwon, Young-Kyun; Yoon, Sungroh

    2013-01-01

    The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.
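
    For readers unfamiliar with the quantity, a transfer entropy between two discretized time series can be sketched as follows. This is a generic plug-in estimate with history length 1, TE(Y→X) = Σ p(x_{t+1}, x_t, y_t) log2[ p(x_{t+1} | x_t, y_t) / p(x_{t+1} | x_t) ], and is not the estimator used in the paper; the abstract does not give the discretization, the history length, the significance test, or the weights of the weighted sum, so none of those are reproduced here. The function name `transfer_entropy` is hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def transfer_entropy(source: Sequence[Hashable], target: Sequence[Hashable]) -> float:
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Assumes both series are already discretized into symbols.
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    n = len(target) - 1  # number of observed transitions
    if n < 1:
        raise ValueError("need at least two time points")

    triples = Counter()   # counts of (x_next, x_now, y_now)
    pairs_xy = Counter()  # counts of (x_now, y_now)
    pairs_xx = Counter()  # counts of (x_next, x_now)
    singles = Counter()   # counts of x_now
    for t in range(n):
        x_next, x_now, y_now = target[t + 1], target[t], source[t]
        triples[(x_next, x_now, y_now)] += 1
        pairs_xy[(x_now, y_now)] += 1
        pairs_xx[(x_next, x_now)] += 1
        singles[x_now] += 1

    te = 0.0
    for (x_next, x_now, y_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_xy[(x_now, y_now)]
        p_given_self = pairs_xx[(x_next, x_now)] / singles[x_now]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te
```

    The estimate is zero when the source adds nothing to the prediction of the target beyond the target's own past, and positive when it does. Plug-in estimates of this kind are biased on short series, which is one reason a significance test such as the one the abstract mentions is needed before interpreting the values.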

  14. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene

  15. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  16. Biopython: freely available Python tools for computational molecular biology and bioinformatics

    DEFF Research Database (Denmark)

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T

    2009-01-01

    SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments...

  17. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools

    Directory of Open Access Journals (Sweden)

    Daisuke Kihara

    2006-01-01

    Full Text Available The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow for users to analyze data from many experiments (data handling, databases), pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlights the most appropriate and useful resources available to cancer researchers. These include not only public databases, but general and specific bioinformatics tools which can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference to cancer researchers and bioinformaticians studying cancer alike.

  18. Preliminary Study of Bioinformatics Patents and Their Classifications Registered in the KIPRIS Database.

    Science.gov (United States)

    Park, Hyun-Seok

    2012-12-01

    Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single word or even several words are sufficient to identify such patents. The analysis of patent subclasses can provide valuable information. In this paper, I did a preliminary study of the current status of bioinformatics patents and their International Patent Classification (IPC) groups registered in the Korea Intellectual Property Rights Information Service (KIPRIS) database.

  19. Comparison of acceleration techniques for selected low-level bioinformatics operations

    Directory of Open Access Journals (Sweden)

    Daniel Langenkaemper

    2016-02-01

    Full Text Available In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g. modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually

  20. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through bioinformatics approach. Among 107 Shigella, 434 CRISPR structure loci were identified with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. The cas gene deletions were discovered in 88 strains. However, there is one strain that does not contain cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Of them, Type I spacer referred to those linked with one gene segment, Type II spacer linked with two or more different gene segments, and Type III spacer undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.
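
    The three-way spacer classification described in this abstract can be written as a small rule. The sketch below is an illustration only: the input (a list of gene-segment names that a spacer was linked to, for example from BLAST or CRISPRTarget hits) and the function name `classify_spacer` are hypothetical, and reading "Type III undefined" as "no gene segment could be linked" is an assumption not confirmed by the abstract.

```python
from typing import Iterable


def classify_spacer(linked_gene_segments: Iterable[str]) -> str:
    """Classify a spacer by the number of distinct gene segments it is linked to.

    Type I:   linked with exactly one gene segment.
    Type II:  linked with two or more different gene segments.
    Type III: undefined (assumed here to mean no linked segment).
    """
    distinct = set(linked_gene_segments)
    if len(distinct) == 1:
        return "Type I"
    if len(distinct) >= 2:
        return "Type II"
    return "Type III"


if __name__ == "__main__":
    # Hypothetical hit lists for three spacers.
    print(classify_spacer(["segment_A"]))
    print(classify_spacer(["segment_A", "segment_B"]))
    print(classify_spacer([]))
```

    Repeated hits to the same segment are collapsed by the set, so a spacer matching one segment several times is still counted as Type I.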

  1. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Science.gov (United States)

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. 
We discuss scope, design considerations and technical and logistical constraints, and explore the

  2. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Full Text Available Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  3. Bioinformatic approach to correction of negative emotional state in rats

    Directory of Open Access Journals (Sweden)

    D. A. Bevzyuk

    2014-05-01

    Full Text Available One of the most important problems of modern society is the growth of aggression, which leads to crimes, sadism, violence and other antisocial actions. Therefore, the study of the mechanisms of development of aggressive behavior and negative aggressive reactions is extremely urgent. Thus, in order to correct aggressive behavior it is extremely important to develop new methodological approaches. Previous studies have shown that the brain biopotentials provide functional transfer of information. With the help of a magnetic or other information carrier, the information can be stored and transmitted remotely from the donor to the recipient and, in that way, can control the behavior of the animal body. The use of biological programs, expressed via the brain biopotentials at their minimum voltage (which is indicated by the amplitude of the frequency spectrum of electrical oscillations in the EEG), can give a therapeutic effect. The aim of the paper is to study the neurobiological mechanisms of aggressive behavior and the use of bioinformatic programs expressed via the potentials of the brain. In the course of the experiments, we have shown the possibility of using ventromedial hypothalamic stimulation as a realistic model of a negative emotional state. In this model, behavioral indexes (reactions of avoidance, increased vocalizations, negative grooming) correlated with the indexes of electrical activity of brain structures of the emotional limbic-neocortical system (numerous epileptiform paroxysms, sharp shocks and complexes with epi components of peak-theta and peak-delta type). The introduction of information expressed in biopotentials of the positive emotional ventrolateral hypothalamus of the brain of a donor rat to recipient rats with a model negative emotional state leads to balancing of emotional homeostasis, resulting in disappearance of convulsive components and diffuse affective paroxysms in EEG, and also in manifestations of positive grooming and

  4. A bioinformatic and computational study of myosin phosphatase subunit diversity.

    Science.gov (United States)

    Dippold, Rachael P; Fisher, Steven A

    2014-08-01

    Variability in myosin phosphatase (MP) subunits may provide specificity in signaling pathways that regulate muscle tone. We utilized public databases and computational algorithms to investigate the phylogenetic diversity of MP regulatory (PPP1R12A-C) and inhibitory (PPP1R14A-D) subunits. The comparison of exonic coding sequences and expression data confirmed or refuted the existence of isoforms and their tissue-specific expression in different model organisms. The comparison of intronic and exonic sequences identified potential expressional regulatory elements. As examples, smooth muscle MP regulatory subunit (PPP1R12A) is highly conserved through evolution. Its alternative exon E24 is present in fish through mammals with two invariant features: 1) a reading frame shift generating a premature termination codon and 2) a hexanucleotide sequence adjacent to the 3' splice site hypothesized to be a novel suppressor of exon splicing. A characteristic of the striated muscle MP regulatory subunit (PPP1R12B) locus is numerous and phylogenetically variable transcriptional start sites. In fish this locus only codes for the small (M21) subunit, suggesting the primordial function of this gene. Inhibitory subunits show little intragenic variability; their diversity is thought to have arisen by expansion and tissue-specific expression of different gene family members. We demonstrate differences in the regulatory landscape between smooth muscle enriched (PPP1R14A) and more ubiquitously expressed (PPP1R14B) family members and identify deeply conserved intronic sequence and predicted transcriptional cis-regulatory elements. This bioinformatic and computational study has uncovered a number of attributes of MP subunits that supports selection of ideal model organisms and testing of hypotheses regarding their physiological significance and regulated expression. Copyright © 2014 the American Physiological Society.

  5. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  6. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  7. Nispero: a cloud-computing based Scala tool specially suited for bioinformatics data processing

    OpenAIRE

    Evdokim Kovach; Alexey Alekhin; Eduardo Pareja Tobes; Raquel Tobes; Eduardo Pareja; Marina Manrique

    2014-01-01

    Nowadays it is widely accepted that the bioinformatics data analysis is a real bottleneck in many research activities related to life sciences. High-throughput technologies like Next Generation Sequencing (NGS) have completely reshaped the biology and bioinformatics landscape. Undoubtedly NGS has allowed important progress in many life-sciences related fields but has also presented interesting challenges in terms of computation capabilities and algorithms. Many kinds of tasks related with NGS...

  8. A New Technique to Manage Big Bioinformatics Data Using Genetic Algorithms

    OpenAIRE

    Huda Jalil Dikhil; Mohammad Shkoukani; Suhail Sami Owais

    2016-01-01

    The continuous growth of data, mainly the medical data at laboratories becomes very complex to use and to manage by using traditional ways. So, the researchers start studying genetic information field which increased in the past thirty years in bioinformatics domain (the computer science field, genetic biology field, and DNA). This growth of data becomes known as big bioinformatics data. Thus, efficient algorithms such as Genetic Algorithms are needed to deal with this big and vast amount of ...

  9. libcov: A C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny

    OpenAIRE

    Butt, Davin; Roger, Andrew J; Blouin, Christian

    2005-01-01

    Background An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. Results The bioinformatics library libcov is a collection of C++ classes that provides a high and low-level interface to maximum likelihood phylogenetics, sequence analysis and a data structure for structural biological methods. libcov can be used ...

  10. SYMBIOmatics: Synergies in Medical Informatics and Bioinformatics – exploring current scientific literature for emerging topics

    Science.gov (United States)

    Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan

    2007-01-01

    Background The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domain to improve collaborative progress between both domains (ref. to ). As part of the project experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. Results This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis pairs of words (bigrams) from the leading bioinformatics and medical informatics journals have been used as indication of existing and emerging technologies and topics over the period 2000–2005 ("recent") and 1990–1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were system biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. Conclusion We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science. PMID:17430562
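
    The bigram analysis described in this record can be illustrated with a minimal sketch. It is not the authors' method: the tokenisation, the `min_recent` threshold, the function names and the toy corpora are all assumptions made for illustration. It counts adjacent word pairs in a "recent" and a "past" corpus and reports pairs that occur only in the recent one.

```python
import re
from collections import Counter


def bigrams(text):
    """Return consecutive lowercase word pairs from one text."""
    words = re.findall(r"[a-z]+", text.lower())
    return list(zip(words, words[1:]))


def bigram_counts(texts):
    """Count bigrams over a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(bigrams(text))
    return counts


def emerging(recent_texts, past_texts, min_recent=2):
    """Bigrams seen at least min_recent times recently and never in the past.

    The threshold is a hypothetical choice, not taken from the paper.
    """
    recent = bigram_counts(recent_texts)
    past = bigram_counts(past_texts)
    return sorted(b for b, n in recent.items() if n >= min_recent and past[b] == 0)


# Toy corpora, invented purely for illustration.
past = [
    "sequence alignment of protein families",
    "protein sequence alignment methods",
]
recent = [
    "text mining of microarray experiments",
    "text mining with support vector machines",
    "sequence alignment and text mining",
]

print(emerging(recent, past))
```

    With these toy inputs only the pair ("text", "mining") passes the threshold; a real analysis would also need stop-word handling and normalisation by corpus size.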

  11. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    Directory of Open Access Journals (Sweden)

    Knox Barry E

    2007-10-01

    Full Text Available Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding sites involved in regulating the differences between murine rod and cone photoreceptor populations. Results To identify highly conserved motifs enriched in promoters that drive expression in either rod or cone photoreceptors, we assembled a set of murine rod-specific, cone-specific, and non-photoreceptor background promoter sequences. These sets were used as input to a newly devised motif discovery algorithm called Iterative Alignment/Modular Motif Selection (IAMMS). Using IAMMS, we predicted 34 motifs that may contribute to rod-specific (19 motifs) or cone-specific (15 motifs) expression patterns. Of these, 16 rod- and 12 cone-specific motifs were found in clusters near the transcription start site. New findings include the observation that cone promoters tend to contain TATA boxes, while rod promoters tend to be TATA-less (exempting Rho and Cnga1). Additionally, we identify putative sites for IL-6 effectors (in rods) and RXR family members (in cones) that can explain experimental data showing changes to cell-fate by activating these signaling pathways during rod/cone development. Two of the predicted motifs (NRE and ROP2) have been confirmed experimentally to be involved in cell-specific expression patterns. We provide a full database of predictions as additional data that may contain further valuable information. IAMMS predictions are compared with existing motif discovery algorithms, DME and BioProspector. We find that over 60% of IAMMS predictions are confirmed by

  12. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets.

    Science.gov (United States)

    Rideout, Jai Ram; Chase, John H; Bolyen, Evan; Ackermann, Gail; González, Antonio; Knight, Rob; Caporaso, J Gregory

    2016-06-13

    Bioinformatics software often requires human-generated tabular text files as input and has specific requirements for how those data are formatted. Users frequently manage these data in spreadsheet programs, which is convenient for researchers who are compiling the requisite information because the spreadsheet programs can easily be used on different platforms including laptops and tablets, and because they provide a familiar interface. It is increasingly common for many different researchers to be involved in compiling these data, including study coordinators, clinicians, lab technicians and bioinformaticians. As a result, many research groups are shifting toward using cloud-based spreadsheet programs, such as Google Sheets, which support the concurrent editing of a single spreadsheet by different users working on different platforms. Most of the researchers who enter data are not familiar with the formatting requirements of the bioinformatics programs that will be used, so validating and correcting file formats is often a bottleneck prior to beginning bioinformatics analysis. We present Keemei, a Google Sheets Add-on, for validating tabular files used in bioinformatics analyses. Keemei is available free of charge from Google's Chrome Web Store. Keemei can be installed and run on any web browser supported by Google Sheets. Keemei currently supports the validation of two widely used tabular bioinformatics formats, the Quantitative Insights into Microbial Ecology (QIIME) sample metadata mapping file format and the Spatially Referenced Genetic Data (SRGD) format, but is designed to easily support the addition of others. Keemei will save researchers time and frustration by providing a convenient interface for tabular bioinformatics file format validation. By allowing everyone involved with data entry for a project to easily validate their data, it will reduce the validation and formatting bottlenecks that are commonly encountered when human-generated data files are

  13. An overview of topic modeling and its current applications in bioinformatics.

    Science.gov (United States)

    Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei

    2016-01-01

    With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics because of their interpretability. Our aim was to review the application and development of topic models for bioinformatics. This paper starts with the description of a topic model, with a focus on the understanding of topic modeling. A general outline is provided on how to build an application in a topic model and how to develop a topic model. Meanwhile, the literature on application of topic models to biological data was searched and analyzed in depth. According to the types of models and the analogy between the concept of document-topic-word and a biological object (as well as the tasks of a topic model), we categorized the related studies and provided an outlook on the use of topic models for the development of bioinformatics applications. Topic modeling is a useful method (in contrast to the traditional means of data reduction in bioinformatics) and enhances researchers' ability to interpret biological information. Nevertheless, due to the lack of topic models optimized for specific biological data, the studies on topic modeling in biological data still have a long and challenging road ahead. We believe that topic models are a promising method for various applications in bioinformatics research.

  14. XML schemas for common bioinformatic data types and their application in workflow systems.

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-11-06

    Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

  15. Teaching bioinformatics and neuroinformatics by using free web-based tools.

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    The purpose of this completely computer-based module is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with anatomy (Mouse Brain Library), quantitative trait locus analysis (WebQTL from GeneNetwork), bioinformatics and gene expression analyses (University of California, Santa Cruz Genome Browser, National Center for Biotechnology Information's Entrez Gene, and the Allen Brain Atlas), and information resources (PubMed). Instructors can use these various websites in concert to teach genetics from the phenotypic level to the molecular level, aspects of neuroanatomy and histology, statistics, quantitative trait locus analysis, and molecular biology (including in situ hybridization and microarray analysis), and to introduce bioinformatic resources. Students use these resources to discover 1) the region(s) of chromosome(s) influencing the phenotypic trait, 2) a list of candidate genes, narrowed by expression data, 3) the in situ pattern of a given gene in the region of interest, 4) the nucleotide sequence of the candidate gene, and 5) articles describing the gene. Teaching materials such as a detailed student/instructor's manual, PowerPoints, sample exams, and links to free Web resources can be found at http://mdcune.psych.ucla.edu/modules/bioinformatics.

  16. Developing a Bioinformatics Program and Supporting Infrastructure in a Biomedical Library

    Directory of Open Access Journals (Sweden)

    Nathan Hosburgh

    2018-03-01

    Full Text Available Background: Over the last couple of decades, the field of bioinformatics has helped spur medical discoveries that offer a better understanding of the genetic basis of disease, which in turn improve public health and save lives. Concomitantly, support requirements for molecular biology researchers have grown in scope and complexity, incorporating specialized resources, technologies, and techniques. Case Presentation: To address this specific need among National Institutes of Health (NIH) intramural researchers, the NIH Library hired an expert bioinformatics trainer and consultant with a PhD in biochemistry to implement a bioinformatics support program. This study traces the program from its inception in 2009 to its present form. Discussion involves the particular skills of program staff, development of content, collection of resources, associated technology, assessment, and the impact of the program on the NIH community. Conclusion: Based on quantitative and qualitative data, the bioinformatics support program has been heavily used and appreciated by researchers. Continued success will depend on filling key staff positions, building on the existing program infrastructure, and keeping abreast of developments within the field to remain relevant and in touch with the medical research community utilizing bioinformatics services.

  17. Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers.

    Science.gov (United States)

    Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.

  18. Analysis of requirements for teaching materials based on the course bioinformatics for plant metabolism

    Science.gov (United States)

    Balqis, Widodo, Lukiati, Betty; Amin, Mohamad

    2017-05-01

    A way to improve the quality of learning in the course of Plant Metabolism in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the needs of bioinformatics-based teaching material in the course Plant Metabolism by the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to the students in the Plant Metabolism course of the Department of Biology, University of Malang, and analysis of the plan of lectures semester (RPS). Learning gains of this course show that it is not yet integrated into the field of bioinformatics. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of a chemical compound of a local plant in Indonesia. Respondents thought that bioinformatics can explain examples and metabolism of a secondary metabolite analysis techniques and discuss potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures of plant metabolism. Therefore, the development of teaching materials including plant metabolism-based bioinformatics is important to improve the understanding of the lecture material in plant metabolism.

  19. Bioinformatics Education in High School: Implications for Promoting Science, Technology, Engineering, and Mathematics Careers

    Science.gov (United States)

    Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre–post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers. PMID:24006393

  20. Structural Bioinformatics-Based Prediction of Exceptional Selectivity of p38 MAP Kinase Inhibitor PH-797804

    Energy Technology Data Exchange (ETDEWEB)

    Xing, Li; Shieh, Huey S.; Selness, Shaun R.; Devraj, Rajesh V.; Walker, John K.; Devadas, Balekudru; Hope, Heidi R.; Compton, Robert P.; Schindler, John F.; Hirsch, Jeffrey L.; Benson, Alan G.; Kurumbail, Ravi G.; Stegeman, Roderick A.; Williams, Jennifer M.; Broadus, Richard M.; Walden, Zara; Monahan, Joseph B.; Pfizer

    2009-07-24

    PH-797804 is a diarylpyridinone inhibitor of p38α mitogen-activated protein (MAP) kinase derived from a racemic mixture as the more potent atropisomer (aS), first proposed by molecular modeling and subsequently confirmed by experiments. On the basis of structural comparison with a different biaryl pyrazole template and supported by dozens of high-resolution crystal structures of p38α inhibitor complexes, PH-797804 is predicted to possess a high level of specificity across the broad human kinase genome. We used a structural bioinformatics approach to identify two selectivity elements encoded by the TXXXG sequence motif on the p38α kinase hinge: (i) Thr106 that serves as the gatekeeper to the buried hydrophobic pocket occupied by 2,4-difluorophenyl of PH-797804 and (ii) the bidentate hydrogen bonds formed by the pyridinone moiety with the kinase hinge requiring an induced 180° rotation of the Met109-Gly110 peptide bond. The peptide flip occurs in p38α kinase due to the critical glycine residue marked by its conformational flexibility. Kinome-wide sequence mining revealed rare presentation of the selectivity motif. Corroboratively, PH-797804 exhibited exceptionally high specificity against MAP kinases and the related kinases. No cross-reactivity was observed in large panels of kinase screens (selectivity ratio of >500-fold). In cellular assays, PH-797804 demonstrated superior potency and selectivity consistent with the biochemical measurements. PH-797804 has met safety criteria in human phase I studies and is under clinical development for several inflammatory conditions. Understanding the rationale for selectivity at the molecular level helps elucidate the biological function and design of specific p38α kinase inhibitors.

  1. BioSmalltalk: a pure object system and library for bioinformatics.

    Science.gov (United States)

    Morales, Hernán F; Giovambattista, Guillermo

    2013-09-15

    We have developed BioSmalltalk, a new environment system for pure object-oriented bioinformatics programming. Adaptive end-user programming systems tend to become more important for discovering biological knowledge, as is demonstrated by the emergence of open-source programming toolkits for bioinformatics in the past years. Our software is intended to bridge the gap between bioscientists and rapid software prototyping while preserving the possibility of scaling to whole-system biology applications. BioSmalltalk performs better in terms of execution time and memory usage than Biopython and BioPerl for some classical situations. BioSmalltalk is cross-platform and freely available (MIT license) through the Google Project Hosting at http://code.google.com/p/biosmalltalk. Contact: hernan.morales@gmail.com. Supplementary data are available at Bioinformatics online.

  2. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Taylor, Ronald C.

    2010-01-01

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
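
As a rough picture of the MapReduce programming style named in this abstract, the following single-process sketch imitates the map, shuffle, and reduce phases. It does not use the Hadoop API, where these phases run in parallel across a cluster; the `map_reduce` helper, the mapper and reducer names, and the read records are all hypothetical.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a single-process imitation of the map, shuffle, and reduce phases."""
    # Map: each record yields zero or more (key, value) pairs.
    # Shuffle: the values are grouped by key.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce: the values of each key are collapsed into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def reads_per_chromosome_mapper(read):
    """Emit (chromosome, 1) for one aligned read."""
    _read_id, chromosome = read
    yield chromosome, 1


def sum_reducer(_key, values):
    """Add up the values emitted for one key."""
    return sum(values)


# Hypothetical aligned reads as (read_id, chromosome) tuples.
reads = [("r1", "chr1"), ("r2", "chr2"), ("r3", "chr1"), ("r4", "chrX")]

counts = map_reduce(reads, reads_per_chromosome_mapper, sum_reducer)
print(counts)
```

Because the mapper looks at one record at a time and the reducer at one key at a time, both phases can be spread over many machines, which is the property that makes the style suitable for large sequencing data sets.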

  3. DEVELOPMENT AND IMPLEMENTATION OF A BIOINFORMATICS ONLINE DISTANCE EDUCATION LEARNING TOOL IN AFRICA

    Directory of Open Access Journals (Sweden)

    Oluwagbemi Olugbenga OLUSEUN

    2009-07-01

    Full Text Available New scientific research fields are evolving on a yearly basis but some parts of the African continent are less aware. Thus, there arises the need for a suitable implementation strategy in introducing the basic components of an emerging scientific field to some part of the African populace through the development of an online distance education learning tool. This emerging field is known as bioinformatics. This research work was instrumental in elucidating the need for a suitable implementation platform for bioinformatics education in parts of the African continent that are less aware of this innovative and interesting field. The aim of this research work was to disseminate the basic knowledge and applications of bioinformatics to these parts of the African continent.

  4. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Full Text Available Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools. This activity is designed to provide secondary and undergraduate biology students with a hands-on activity meant to explore and understand gene structure with the use of basic bioinformatic tools. Students are provided an “unknown” sequence from which they are asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  5. Databases and Associated Bioinformatic Tools in Studies of Food Allergens, Epitopes and Haptens – a Review

    Directory of Open Access Journals (Sweden)

    Bucholska Justyna

    2018-06-01

    Full Text Available Allergies and/or food intolerances are a growing problem of the modern world. Difficulties associated with the correct diagnosis of food allergies result in the need to classify the factors causing allergies and allergens themselves. Therefore, internet databases and other bioinformatic tools play a special role in deepening knowledge of biologically-important compounds. Internet repositories, as a source of information on different chemical compounds, including those related to allergy and intolerance, are increasingly being used by scientists. Bioinformatic methods play a significant role in biological and medical sciences, and their importance in food science is increasing. This study aimed at presenting selected databases and tools of bioinformatic analysis useful in research on food allergies, allergens (11 databases), epitopes (7 databases), and haptens (2 databases). It also presents examples of the application of computer methods in studies related to allergies.

  6. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Full Text Available Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.
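
The two imbalance figures quoted in this abstract (the share of usage held by the top names, and the share of names mentioned only once) are simple to compute from a list of extracted mentions. The sketch below shows the arithmetic on made-up data; the function name `usage_concentration`, the rounding rule for the top group, and the mention list are assumptions and are not taken from the study.

```python
import math
from collections import Counter


def usage_concentration(mentions, top_fraction=0.05):
    """Return (share of mentions held by the top names, share of names seen once)."""
    counts = Counter(mentions)
    if not counts:
        raise ValueError("mentions must not be empty")
    ranked = sorted(counts.values(), reverse=True)
    # Assumption: the top group is rounded up and holds at least one name.
    top_n = max(1, math.ceil(len(ranked) * top_fraction))
    top_share = sum(ranked[:top_n]) / sum(ranked)
    singleton_share = sum(1 for c in ranked if c == 1) / len(ranked)
    return top_share, singleton_share


# Hypothetical mentions; the tallies are invented for the example.
mentions = ["BLAST"] * 6 + ["R"] * 2 + ["toolA", "toolB"]

top_share, singleton_share = usage_concentration(mentions, top_fraction=0.25)
print(f"top share: {top_share:.0%}, mentioned once: {singleton_share:.0%}")
```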

  7. "Broadband" Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP): Educational Model for Upliftment and Sustainable Development.

    Directory of Open Access Journals (Sweden)

    Emile R Chimusa

    2015-11-01

    Full Text Available A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.

  8. New Directions in Statistical Physics: Econophysics, Bioinformatics, and Pattern Recognition

    International Nuclear Information System (INIS)

    Grassberger, P

    2004-01-01

    This book contains 18 contributions from different authors. Its subtitle 'Econophysics, Bioinformatics, and Pattern Recognition' says more precisely what it is about: not so much about central problems of conventional statistical physics like equilibrium phase transitions and critical phenomena, but about its interdisciplinary applications. After a long period of specialization, physicists have, over the last few decades, found more and more satisfaction in breaking out of the limitations set by the traditional classification of sciences. Indeed, this classification had never been strict, and physicists in particular had always ventured into other fields. Helmholtz, in the middle of the 19th century, had considered himself a physicist when working on physiology, stressing that the physics of animate nature is as much a legitimate field of activity as the physics of inanimate nature. Later, Max Delbrueck and Francis Crick did for experimental biology what Schroedinger did for its theoretical foundation. And many of the experimental techniques used in chemistry, biology, and medicine were developed by a steady stream of talented physicists who left their proper discipline to venture out into the wider world of science. The development we have witnessed over the last thirty years or so is different. It started with neural networks where methods could be applied which had been developed for spin glasses, but today's list includes vehicular traffic (driven lattice gases), geology (self-organized criticality), economy (fractal stochastic processes and large scale simulations), engineering (dynamical chaos), and many others. By staying in the physics departments, these activities have transformed the physics curriculum and the view physicists have of themselves. In many departments there are now courses on econophysics or on biological physics, and some universities offer degrees in the physics of traffic or in econophysics. In order to document this change of attitude

  9. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing. Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  10. Bioinformatics prediction of swine MHC class I epitopes from Porcine Reproductive and Respiratory Syndrome Virus

    DEFF Research Database (Denmark)

    Welner, Simon; Nielsen, Morten; Lund, Ole

    an effective CTL response against PRRSV, we have taken a bioinformatics approach to identify common PRRSV epitopes predicted to react broadly with predominant swine MHC (SLA) alleles. First, the genomic integrity and sequencing method was examined for 334 available complete PRRSV type 2 genomes leaving 104...... by the PopCover algorithm, providing a final list of 54 epitopes prioritized according to maximum coverage of PRRSV strains and SLA alleles. This bioinformatics approach provides a rational strategy for selecting peptides for a CTL-activating vaccine with broad coverage of both virus and swine diversity...

  11. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  12. Thesis Proposal

    DEFF Research Database (Denmark)

    Sloth, Erik

    2010-01-01

    the specific research projects. This 'reversed' line of argument is justified by the fact that the core of the PhD thesis is precisely the research-based articles. The theoretical review thus serves as a contextualization of the research projects. In the theoretical review I start......The structure of the Thesis proposal is as follows: First, I present my specific empirical research projects, which are to result in the articles of the thesis. I then present the theoretical considerations on the concept of experience and on consumer culture theory that form the background for my having arrived at...

  13. Functionality and Evolutionary History of the Chaperonins in Thermophilic Archaea. A Bioinformatical Perspective

    Science.gov (United States)

    Karlin, Samuel

    2004-01-01

    We used bioinformatics methods to study phylogenetic relations and differentiation patterns of the archaeal chaperonin 60 kDa heat-shock protein (HSP60) genes in support of the study of differential expression patterns of the three chaperonin genes encoded in Sulfolobus shibatae.

  14. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics.

    Science.gov (United States)

    Taylor, Ronald C

    2010-12-21

    Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms.

  15. Assessing an effective undergraduate module teaching applied bioinformatics to biology students.

    Science.gov (United States)

    Madlung, Andreas

    2018-01-01

    Applied bioinformatics skills are becoming ever more indispensable for biologists, yet incorporation of these skills into the undergraduate biology curriculum is lagging behind, in part due to a lack of instructors willing and able to teach basic bioinformatics in classes that don't specifically focus on quantitative skill development, such as statistics or computer sciences. To help undergraduate course instructors who themselves did not learn bioinformatics as part of their own education and are hesitant to plunge into teaching big data analysis, a module was developed that is written in plain-enough language, using publicly available computing tools and data, to allow novice instructors to teach next-generation sequence analysis to upper-level undergraduate students. To determine if the module allowed students to develop a better understanding of and appreciation for applied bioinformatics, various tools were developed and employed to assess the impact of the module. This article describes both the module and its assessment. Students found the activity valuable for their education and, in focus group discussions, emphasized that they saw a need for more and earlier instruction of big data analysis as part of the undergraduate biology curriculum.

  16. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  17. Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery

    Science.gov (United States)

    Guingab-Cagmat, J.D.; Cagmat, E.B.; Hayes, R.L.; Anagli, J.

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  18. Bioinformatics and Systems Biology: bridging the gap between heterogeneous student backgrounds.

    NARCIS (Netherlands)

    Abeln, S.; Molenaar, D.; Feenstra, K.A.; Hoefsloot, H.C.J.; Teusink, B.; Heringa, J.

    2013-01-01

    Teaching students with very diverse backgrounds can be extremely challenging. This article uses the Bioinformatics and Systems Biology MSc in Amsterdam as a case study to describe how the knowledge gap for students with heterogeneous backgrounds can be bridged. We show that a mix in backgrounds can

  19. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However,

  20. Models@Home: distributed computing in bioinformatics using a screensaver based approach.

    NARCIS (Netherlands)

    Krieger, E.; Vriend, G.

    2002-01-01

    MOTIVATION: Due to the steadily growing computational demands in bioinformatics and related scientific disciplines, one is forced to make optimal use of the available resources. A straightforward solution is to build a network of idle computers and let each of them work on a small piece of a

  1. A case study of tuning MapReduce for efficient Bioinformatics in the cloud

    Energy Technology Data Exchange (ETDEWEB)

    Shi, Lizhen [Florida State Univ., Tallahassee, FL (United States); Wang, Zhong [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Yu, Weikuan [Florida State Univ., Tallahassee, FL (United States); Meng, Xiandong [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2016-10-06

    The combination of the Hadoop MapReduce programming model and cloud computing allows biological scientists to analyze next-generation sequencing (NGS) data in a timely and cost-effective manner. Cloud computing platforms remove the burden of IT facility procurement and management from end users and provide ease of access to Hadoop clusters. However, biological scientists are still expected to choose appropriate Hadoop parameters for running their jobs. More importantly, the available Hadoop tuning guidelines are either obsolete or too general to capture the particular characteristics of bioinformatics applications. In this paper, we aim to minimize the cloud computing cost spent on bioinformatics data analysis by optimizing the extracted significant Hadoop parameters. When using MapReduce-based bioinformatics tools in the cloud, the default settings often lead to resource underutilization and wasteful expenses. We choose k-mer counting, a representative application used in a large number of NGS data analysis tools, as our study case. Experimental results show that, with the fine-tuned parameters, we achieve a total of 4× speedup compared with the original performance (using the default settings). Finally, this paper presents an exemplary case for tuning MapReduce-based bioinformatics applications in the cloud, and documents the key parameters that could lead to significant performance benefits.
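
The k-mer counting workload used as the study case in this abstract can be sketched in a few lines. The version below runs in one process and only mirrors the MapReduce shape (per-partition counting, then a merge); it is not the tool evaluated in the paper, and the function names, the choice of k, and the reads are made up for illustration.

```python
from collections import Counter


def kmers(read, k):
    """Yield every length-k substring of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_partition(reads, k):
    """Map side: count k-mers within one partition of the reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def merge_counts(partial_counts):
    """Reduce side: sum the per-partition counts for each k-mer."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


# Two hypothetical partitions of made-up reads.
partitions = [["ACGTAC", "CGTA"], ["GTACG"]]

total = merge_counts(count_partition(p, k=3) for p in partitions)
print(total.most_common())
```

Counting inside each partition before merging keeps the amount of data passed to the merge step small, which is the kind of behavior that cluster tuning parameters influence in a real deployment.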

  2. A telescope for the RNA universe : novel bioinformatic approaches to analyze RNA sequencing data

    NARCIS (Netherlands)

    Pulyakhina, Irina

    2016-01-01

    In this thesis I focus on the application of bioinformatics to analyze RNA. The type of experimental data of interest is sequencing data generated with various Next Generation Sequencing techniques: nuclear RNA, cytoplasmic RNA, captured polyadenylated RNA fragments, etc. I highlight the necessity in

  3. A global perspective on evolving bioinformatics and data science training needs.

    Science.gov (United States)

    Attwood, Teresa K; Blackford, Sarah; Brazas, Michelle D; Davies, Angela; Schneider, Maria Victoria

    2017-08-29

    Bioinformatics is now intrinsic to life science research, but the past decade has witnessed a continuing deficiency in this essential expertise. Basic data stewardship is still taught relatively rarely in life science education programmes, creating a chasm between theory and practice, and fuelling demand for bioinformatics training across all educational levels and career roles. Concerned by this, surveys have been conducted in recent years to monitor bioinformatics and computational training needs worldwide. This article briefly reviews the principal findings of a number of these studies. We see that there is still a strong appetite for short courses to improve expertise and confidence in data analysis and interpretation; strikingly, however, the most urgent appeal is for bioinformatics to be woven into the fabric of life science degree programmes. Satisfying the relentless training needs of current and future generations of life scientists will require a concerted response from stakeholders across the globe, who need to deliver sustainable solutions capable of both transforming education curricula and cultivating a new cadre of trainer scientists. © The Author 2017. Published by Oxford University Press.

  4. 'Students-as-partners' scheme enhances postgraduate students' employability skills while addressing gaps in bioinformatics education.

    Science.gov (United States)

    Mello, Luciane V; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan

    2017-01-01

    Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student-staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators' teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability.

  5. Bioinformatic analysis of Rp1 gene causing visual disparity in humans

    African Journals Online (AJOL)

    user

    Bioinformatic analysis of Rp1 gene causing visual disparity in humans. Sana Zahra and ... mRNA degradation but also results in truncated protein production leading towards visual disparity in humans. Secondary structure of RP1 gene was ..... The comparison clearly supports the fact that missense mutation R677X causes ...

  6. Synergy between Medical Informatics and Bioinformatics: Facilitating Genomic Medicine for Future Health Care

    Czech Academy of Sciences Publication Activity Database

    Martin-Sanchez, F.; Iakovidis, I.; Norager, S.; Maojo, V.; de Groen, P.; Van der Lei, J.; Jones, T.; Abraham-Fuchs, K.; Apweiler, R.; Babic, A.; Baud, R.; Breton, V.; Cinquin, P.; Doupi, P.; Dugas, M.; Eils, R.; Engelbrecht, R.; Ghazal, P.; Jehenson, P.; Kulikowski, C.; Lampe, K.; De Moor, G.; Orphanoudakis, S.; Rossing, N.; Sarachan, B.; Sousa, A.; Spekowius, G.; Thireos, G.; Zahlmann, G.; Zvárová, Jana; Hermosilla, I.; Vicente, F. J.

    2004-01-01

    Vol. 37, - (2004), pp. 30-42 ISSN 1532-0464 Institutional research plan: CEZ:AV0Z1030915 Keywords : bioinformatics * medical informatics * genomics * genomic medicine * biomedical informatics Subject RIV: BD - Theory of Information Impact factor: 1.013, year: 2004

  7. Bioinformatic approaches to augment study of epithelial-to-mesenchymal transition in lung cancer.

    Science.gov (United States)

    Beck, Tim N; Chikwem, Adaeze J; Solanki, Nehal R; Golemis, Erica A

    2014-10-01

    Bioinformatic approaches are intended to provide systems level insight into the complex biological processes that underlie serious diseases such as cancer. In this review we describe current bioinformatic resources, and illustrate how they have been used to study a clinically important example: epithelial-to-mesenchymal transition (EMT) in lung cancer. Lung cancer is the leading cause of cancer-related deaths and is often diagnosed at advanced stages, leading to limited therapeutic success. While EMT is essential during development and wound healing, pathological reactivation of this program by cancer cells contributes to metastasis and drug resistance, both major causes of death from lung cancer. Challenges of studying EMT include its transient nature, its molecular and phenotypic heterogeneity, and the complicated networks of rewired signaling cascades. Given the biology of lung cancer and the role of EMT, it is critical to better align the two in order to advance the impact of precision oncology. This task relies heavily on the application of bioinformatic resources. Besides summarizing recent work in this area, we use four EMT-associated genes, TGF-β (TGFB1), NEDD9/HEF1, β-catenin (CTNNB1) and E-cadherin (CDH1), as exemplars to demonstrate the current capacities and limitations of probing bioinformatic resources to inform hypothesis-driven studies with therapeutic goals. Copyright © 2014 the American Physiological Society.

  8. Surfing the data tsunami, a bioinformatic dissection of the proangiogenic monocyte

    NARCIS (Netherlands)

    van der Pouw Kraan, T. C. T. M.; van der Laan, A. M.; Piek, J. J.; Horrevoets, A. J. G.

    2012-01-01

    In this review we compare expression studies on monocyte subsets as an example to show the integrated possibilities of molecular databases and bioinformatic analysis tools. Monocytes have been recognized as cells with great plasticity and differentiation potential that play a pivotal role in

  9. Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses

    Science.gov (United States)

    Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M.; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V.; Ma'Ayan, Avi

    2018-02-01

    Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated 'canned' analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also contains the indexing of 4,901 published bioinformatics software tools, and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools.

  10. DIVERGENOME: a bioinformatics platform to assist population genetics and genetic epidemiology studies.

    Science.gov (United States)

    Magalhães, Wagner C S; Rodrigues, Maíra R; Silva, Donnys; Soares-Souza, Giordano; Iannini, Márcia L; Cerqueira, Gustavo C; Faria-Campos, Alessandra C; Tarazona-Santos, Eduardo

    2012-05-01

    Large-scale genomics initiatives such as the HapMap project and the 1000-genomes rely on powerful bioinformatics support to assist data production and analysis. Contrastingly, few bioinformatics platforms oriented to smaller research groups exist to store, handle, share, and integrate data from different sources, as well as to assist these scientists to perform their analyses efficiently. We developed such a bioinformatics platform, DIVERGENOME, to assist population genetics and genetic epidemiology studies performed by small- to medium-sized research groups. The platform is composed of two integrated components, a relational database (DIVERGENOMEdb), and a set of tools to convert data formats as required by popular software in population genetics and genetic epidemiology (DIVERGENOMEtools). In DIVERGENOMEdb, information on genotypes, polymorphism, laboratory protocols, individuals, populations, and phenotypes is organized in projects. These can be queried according to permissions. Here, we validated DIVERGENOME through a use case regarding the analysis of SLC2A4 genetic diversity in human populations. DIVERGENOME, with its intuitive Web interface and automatic data loading capability, facilitates its use by individuals without bioinformatics background, allowing complex queries to be easily interrogated and straightforward data format conversions (not available in similar platforms). DIVERGENOME is open source, freely available, and can be accessed online (pggenetica.icb.ufmg.br/divergenome) or hosted locally. © 2012 Wiley Periodicals, Inc.

  11. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  12. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  13. Tissue damage in organic rainbow trout muscle investigated by proteomics and bioinformatics

    DEFF Research Database (Denmark)

    Wulff, Tune; Silva, T.; Nielsen, Michael Engelbrecht

    2013-01-01

    and magnitude of the cellular response, in the context of a regenerative process. Using a bioinformatics approach, the main biological function of these proteins were assigned, showing the regulation of proteins involved in processes like apoptosis, iron homeostasis and regulation of muscular structure...

  14. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  15. Infusing Bioinformatics and Research-Like Experience into a Molecular Biology Laboratory Course

    Science.gov (United States)

    Nogaj, Luiza A.

    2014-01-01

    A nine-week laboratory project designed for a sophomore level molecular biology course is described. Small groups of students (3-4 per group) choose a tumor suppressor gene (TSG) or an oncogene for this project. Each group researches the role of their TSG/oncogene from primary literature articles and uses bioinformatics engines to find the gene…

  16. 10 years for the Journal of Bioinformatics and Computational Biology (2003-2013) -- a retrospective.

    Science.gov (United States)

    Eisenhaber, Frank; Sherman, Westley Arthur

    2014-06-01

    The Journal of Bioinformatics and Computational Biology (JBCB) started publishing scientific articles in 2003. It has established itself as home for solid research articles in the field (~ 60 per year) that are surprisingly well cited. JBCB has an important function as alternative publishing channel in addition to other, bigger journals.

  17. Bioinformatics Education in High School: Implications for Promoting Science, Technology, Engineering, and Mathematics Careers

    Science.gov (United States)

    Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The…

  18. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  19. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    Science.gov (United States)

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  20. Ramping up to the Biology Workbench: A Multi-Stage Approach to Bioinformatics Education

    Science.gov (United States)

    Greene, Kathleen; Donovan, Sam

    2005-01-01

    In the process of designing and field-testing bioinformatics curriculum materials, we have adopted a three-stage, progressive model that emphasizes collaborative scientific inquiry. The elements of the model include: (1) context setting, (2) introduction to concepts, processes, and tools, and (3) development of competent use of technologically…

  1. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    Science.gov (United States)

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  2. A Complementary Bioinformatics Approach to Identify Potential Plant Cell Wall Glycosyltransferase-Encoding Genes

    DEFF Research Database (Denmark)

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi

    2004-01-01

    . Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes-all classified in the CAZy system-have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics...

  3. Alu Insertions and Genetic Diversity: A Preliminary Investigation by an Undergraduate Bioinformatics Class

    Science.gov (United States)

    Elwess, Nancy L.; Duprey, Stephen L.; Harney, Lindesay A.; Langman, Jessie E.; Marino, Tara C.; Martinez, Carolina; McKeon, Lauren L.; Moss, Chantel I. E.; Myrie, Sasha S.; Taylor, Luke Ryan

    2008-01-01

    "Alu"-insertion polymorphisms were used by an undergraduate Bioinformatics class to study how these insertion sites could be the basis for an investigation in human population genetics. Based on the students' investigation, both allele and genotype "Alu" frequencies were determined for African-American and Japanese populations as well as a…

  4. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  5. A bioinformatics-based overview of protein Lys-Nε-acetylation

    Science.gov (United States)

    Among posttranslational modifications, there are some conceptual similarities between Lys-Nε-acetylation and Ser/Thr/Tyr O-phosphorylation. Herein we present a bioinformatics-based overview of reversible protein Lys-acetylation, including some comparisons with reversible protein phosphorylation. T...

  6. Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…

  7. Protein co-evolution: how do we combine bioinformatics and experimental approaches?

    Science.gov (United States)

    Sandler, Inga; Abu-Qarn, Mehtap; Aharoni, Amir

    2013-02-02

    Molecular co-evolution is manifested by compensatory changes in proteins designed to enable adaptation to their natural environment. In recent years, bioinformatics approaches allowed for the detection of co-evolution at the level of the whole protein or of specific residues. Such efforts enabled prediction of protein-protein interactions, functional assignments of proteins and the identification of interacting residues, thereby providing information on protein structure. Still, despite such advances, relatively little is known regarding the functional implications of sequence divergence resulting from protein co-evolution. While bioinformatics approaches usually analyze thousands of proteins to obtain a broad view of protein co-evolution, experimental evaluation of protein co-evolution serves to study only individual proteins. In this review, we describe recent advances in bioinformatics and experimental efforts aimed at examining protein co-evolution. Accordingly, we discuss possible modes of crosstalk between the bioinformatics and experimental approaches to facilitate the identification of co-evolutionary signals in proteins and to understand their implications for the structure and function of proteins.

  8. Integrative content-driven concepts for bioinformatics “beyond the cell”

    Indian Academy of Sciences (India)

    2006-10-10

    Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global ('omics') approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the ...

  9. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and
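
    The batching idea in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library and is not PaPy's API: the helper names (batched, parallel_stage, run_pipeline) and the two toy stages are assumptions made for illustration. Each stage is a lazy map over its input, evaluated one batch at a time on a shared thread pool, so a larger batch size gives more parallelism while a smaller one keeps fewer items in memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def parallel_stage(func, items, batch_size, pool):
    """Lazily apply `func` to `items`, evaluating one batch at a time in the pool."""
    for batch in batched(items, batch_size):
        # Only the current batch is materialised; pool.map preserves input order.
        yield from pool.map(func, batch)


def reverse_complement(seq):
    """Toy stage 1 (hypothetical): reverse-complement a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq):
    """Toy stage 2 (hypothetical): fraction of G and C bases in a DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_pipeline(sequences, batch_size=2, workers=2):
    """Chain the two stages into a linear dataflow and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stage1 = parallel_stage(reverse_complement, sequences, batch_size, pool)
        stage2 = parallel_stage(gc_fraction, stage1, batch_size, pool)
        return list(stage2)
```

    Calling run_pipeline(["AACG", "GGCC", "ATAT"]) pulls items through both stages two at a time. A real workflow as described in the abstract would be a directed acyclic graph rather than this linear chain.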

  10. Improving data workflow systems with cloud services and use of open data for bioinformatics research.

    Science.gov (United States)

    Karim, Md Rezaul; Michel, Audrey; Zappa, Achille; Baranov, Pavel; Sahay, Ratnesh; Rebholz-Schuhmann, Dietrich

    2017-04-16

    Data workflow systems (DWFSs) enable bioinformatics researchers to combine components for data access and data analytics, and to share the final data analytics approach with their collaborators. Increasingly, such systems have to cope with large-scale data, such as full genomes (about 200 GB each), public fact repositories (about 100 TB of data) and 3D imaging data at even larger scales. As moving the data becomes cumbersome, the DWFS needs to embed its processes into a cloud infrastructure, where the data are already hosted. As the standardized public data play an increasingly important role, the DWFS needs to comply with Semantic Web technologies. This advancement to DWFS would reduce overhead costs and accelerate the progress in bioinformatics research based on large-scale data and public resources, as researchers would require less specialized IT knowledge for the implementation. Furthermore, the high data growth rates in bioinformatics research drive the demand for parallel and distributed computing, which then imposes a need for scalability and high-throughput capabilities onto the DWFS. As a result, requirements for data sharing and access to public knowledge bases suggest that compliance of the DWFS with Semantic Web standards is necessary. In this article, we will analyze the existing DWFS with regard to their capabilities toward public open data use as well as large-scale computational and human interface requirements. We untangle the parameters for selecting a preferable solution for bioinformatics research with particular consideration to using cloud services and Semantic Web technologies. Our analysis leads to research guidelines and recommendations toward the development of future DWFS for the bioinformatics research community. © The Author 2017. Published by Oxford University Press.

  11. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more recently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.
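
    The per-zone nucleotide identities quoted above are the kind of figure that can be computed from a pairwise alignment. The sketch below is an illustration only, not the method used in the cited work: the function names are hypothetical, the zone coordinates are treated as 1-based inclusive alignment columns (an assumption), and columns with a gap in either sequence are skipped.

```python
def percent_identity(seq_a, seq_b, start, end):
    """Percent identical positions in the 1-based, inclusive range start..end.

    Assumes seq_a and seq_b are two rows of the same alignment (equal length).
    Columns with a gap ('-') in either row are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a[start - 1:end], seq_b[start - 1:end]):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def zone_identities(seq_a, seq_b, zones):
    """Map each (start, end) zone to its percent identity."""
    return {zone: percent_identity(seq_a, seq_b, *zone) for zone in zones}
```

    With two aligned hexon sequences in hand, zone_identities(a, b, [(1, 393), (394, 1410), (1411, 2910)]) would return one percentage per zone.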

  12. The eBioKit, a stand-alone educational platform for bioinformatics.

    Science.gov (United States)

    Hernández-de-Diego, Rafael; de Villiers, Etienne P; Klingström, Tomas; Gourlé, Hadrien; Conesa, Ana; Bongcam-Rudloff, Erik

    2017-09-01

    Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

  13. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines.

    Science.gov (United States)

    Cieślik, Marcin; Mura, Cameron

    2011-02-25

    Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive documentation and annotated usage

  14. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Science.gov (United States)

    2011-01-01

    Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive

  15. Rapid development of entity-based data models for bioinformatics with persistence object-oriented design and structured interfaces.

    Science.gov (United States)

    Ezra Tsur, Elishai

    2017-01-01

    Databases are imperative for research in bioinformatics and computational biology. Current challenges in database design include data heterogeneity and context-dependent interconnections between data entities. These challenges drove the development of unified data interfaces and specialized databases. The curation of specialized databases is an ever-growing challenge due to the introduction of new data sources and the emergence of new relational connections between established datasets. Here, an open-source framework for the curation of specialized databases is proposed. The framework supports user-designed models of data encapsulation, object persistence and structured interfaces to local and external data sources such as MalaCards, Biomodels and the National Centre for Biotechnology Information (NCBI) databases. The proposed framework was implemented using Java as the development environment, EclipseLink as the data persistence agent and Apache Derby as the database manager. Syntactic analysis was based on the J3D, jsoup, Apache Commons and w3c.dom open libraries. Finally, the construction of a specialized database for aneurysm-associated vascular diseases is demonstrated. This database contains 3-dimensional geometries of aneurysms, patients' clinical information, articles, biological models, related diseases and our recently published model of aneurysms' risk of rupture. The framework is available at: http://nbel-lab.com.

  16. Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases

    Directory of Open Access Journals (Sweden)

    Purta Elzbieta

    2007-03-01

    , and one (COG1756 represented by Nep1p) has been already implicated in RNA metabolism, but its biochemical function has been unknown. Based on the inference of orthologous and paralogous relationships between all SPOUT families we propose that the Last Universal Common Ancestor (LUCA) of all extant organisms contained at least three SPOUT members, ancestors of contemporary RNA MTases that carry out m1G, m3U, and 2'O-ribose methylation, respectively. In this work we also speculate on the origin of the knot and propose possible 'unknotted' ancestors. The results of our analysis provide a comprehensive 'roadmap' for experimental characterization of SPOUT MTases and interpretation of functional studies in the light of sequence-structure relationships.

  17. Remote data retrieval for bioinformatics applications: an agent migration approach.

    Directory of Open Access Journals (Sweden)

    Lei Gao

    Full Text Available Some approaches have been developed to retrieve data automatically from one or multiple remote biological data sources. However, most of them require researchers to remain online and wait for returned results. This not only requires a highly available network connection, but also may cause network overload. Moreover, so far none of the existing approaches has been designed to address the following problems when retrieving remote data in a mobile network environment: (1) the resources of mobile devices are limited; (2) the network connection is of relatively low quality; and (3) mobile users are not always online. To address the aforementioned problems, we integrate an agent migration approach with a multi-agent system to overcome the high latency or limited bandwidth problem by moving their computations to the required resources or services. More importantly, the approach is fit for mobile computing environments. Also presented in this paper are the system architecture, the migration strategy, and the security authentication of agent migration. As a demonstration, remote data retrieval from GenBank was used to illustrate the feasibility of the proposed approach.

  18. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    Science.gov (United States)

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  19. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology.

    Science.gov (United States)

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-11-01

    Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, high-level modern research is currently being applied at both the molecular level and the field level. Computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another.

  20. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    Science.gov (United States)

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
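
    As a rough illustration of the palindromic-motif search mentioned above, the following Python sketch scans a DNA string for windows that equal their own reverse complement. It is not the authors' method (the abstract does not describe their tooling); the function names, the window lengths and the example sequence are all assumptions made for this sketch.

```python
# Hypothetical helper names; a minimal sketch, not the pipeline used in the study.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase/lowercase ACGT string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 8):
    """List (start, motif) for every window equal to its reverse complement.

    Only even lengths can match, so min_len should be even; the loop
    steps through lengths min_len, min_len + 2, ... up to max_len.
    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if set(window) <= set("ACGT") and window == reverse_complement(window):
                hits.append((start, window))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing the EcoRI site GAATTC.
    print(find_palindromes("TTGAATTCAA", min_len=6, max_len=6))
```

    Positions are 0-based. A real analysis would more likely allow mismatches and score conservation across aligned leader sequences, which this sketch does not attempt.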

  1. Marine microbial biodiversity, bioinformatics and biotechnology (M2B3) data reporting and service standards.

    Science.gov (United States)

    Ten Hoopen, Petra; Pesant, Stéphane; Kottmann, Renzo; Kopf, Anna; Bicak, Mesude; Claus, Simon; Deneudt, Klaas; Borremans, Catherine; Thijsse, Peter; Dekeyzer, Stefanie; Schaap, Dick Ma; Bowler, Chris; Glöckner, Frank Oliver; Cochrane, Guy

    2015-01-01

    Contextual data collected concurrently with molecular samples are critical to the use of metagenomics in the fields of marine biodiversity, bioinformatics and biotechnology. We present here Marine Microbial Biodiversity, Bioinformatics and Biotechnology (M2B3) standards for "Reporting" and "Serving" data. The M2B3 Reporting Standard (1) describes minimal mandatory and recommended contextual information for a marine microbial sample obtained in the epipelagic zone, (2) includes meaningful information for researchers in the oceanographic, biodiversity and molecular disciplines, and (3) can easily be adopted by any marine laboratory with minimum sampling resources. The M2B3 Service Standard defines a software interface through which these data can be discovered and explored in data repositories. The M2B3 Standards were developed by the European project Micro B3, funded under the 7th Framework Programme "Ocean of Tomorrow", and were first used with the Ocean Sampling Day initiative. We believe that these standards have value in broader marine science.

  2. GeneDig: a web application for accessing genomic and bioinformatics knowledge.

    Science.gov (United States)

    Suciu, Radu M; Aydin, Emir; Chen, Brian E

    2015-02-28

    With the exponential increase and widespread availability of genomic, transcriptomic, and proteomic data, accessing these '-omics' data is becoming increasingly difficult. The current resources for accessing and analyzing these data have been created to perform highly specific functions intended for specialists, and thus typically emphasize functionality over user experience. We have developed a web-based application, GeneDig.org, that allows any general user access to genomic information with ease and efficiency. GeneDig allows for searching and browsing genes and genomes, while a dynamic navigator displays genomic, RNA, and protein information simultaneously for co-navigation. We demonstrate that our application allows more than five times faster, and more efficient, access to genomic information than any currently available method. We have developed GeneDig as a platform for bioinformatics integration focused on usability as its central design. This platform will introduce genomic navigation to broader audiences while aiding the bioinformatics analyses performed in everyday biology research.

  3. Bioinformatics in the plant genomic and phenomic domain: The German contribution to resources, services and perspectives.

    Science.gov (United States)

    Schmutzer, Thomas; Bolger, Marie E; Rudd, Stephen; Chen, Jinbo; Gundlach, Heidrun; Arend, Daniel; Oppermann, Markus; Weise, Stephan; Lange, Matthias; Spannagl, Manuel; Usadel, Björn; Mayer, Klaus F X; Scholz, Uwe

    2017-11-10

    Plant genetic resources are a substantial opportunity for plant breeding, preservation and maintenance of biological diversity. As part of the German Network for Bioinformatics Infrastructure (de.NBI) the German Crop BioGreenformatics Network (GCBN) focuses mainly on crop plants and provides both data and software infrastructure which are tailored to the needs of the plant research community. Our mission and key objectives include: (1) provision of transparent access to germplasm seeds, (2) the delivery of improved workflows for plant gene annotation, and (3) implementation of bioinformatics services that link genotypes and phenotypes. This review introduces the GCBN's spectrum of web-services and integrated data resources that address common research problems in the plant genomics community. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  4. Biologically inspired intelligent decision making: a commentary on the use of artificial neural networks in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2014-01-01

    Artificial neural networks (ANNs) are a class of powerful machine learning models for classification and function approximation which have analogs in nature. An ANN learns to map stimuli to responses through repeated evaluation of exemplars of the mapping. This learning approach results in networks which are recognized for their noise tolerance and ability to generalize meaningful responses for novel stimuli. It is these properties of ANNs which make them appealing for applications to bioinformatics problems where interpretation of data may not always be obvious, and where the domain knowledge required for deductive techniques is incomplete or can cause a combinatorial explosion of rules. In this paper, we provide an introduction to artificial neural network theory and review some interesting recent applications to bioinformatics problems.

  5. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable. The analysis of Next Generation Sequencing datasets needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the 6th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, 28-30 March 2012, which brought together interdisciplinary scientists with a strong background in the biological and computational sciences.

  6. Dugong: a Docker image, based on Ubuntu Linux, focused on reproducibility and replicability for bioinformatics analyses.

    Science.gov (United States)

    Menegidio, Fabiano B; Jabes, Daniela L; Costa de Oliveira, Regina; Nunes, Luiz R

    2018-02-01

    This manuscript introduces and describes Dugong, a Docker image based on Ubuntu 16.04, which automates installation of more than 3500 bioinformatics tools (along with their respective libraries and dependencies) in alternative computational environments. The software operates through a user-friendly XFCE4 graphic interface that allows software management and installation by users not fully familiarized with the Linux command line and provides the Jupyter Notebook to assist in the delivery and exchange of consistent and reproducible protocols and results across laboratories, assisting in the development of open science projects. Source code and instructions for local installation are available at https://github.com/DugongBioinformatics, under the MIT open source license. Contact: Luiz.nunes@ufabc.edu.br. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  7. Cluster Flow: A user-friendly bioinformatics workflow tool [version 1; referees: 3 approved]

    Directory of Open Access Journals (Sweden)

    Philip Ewels

    2016-12-01

    Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprised of multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available under a GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.

  8. Detecting the genetic link between Alzheimer's disease and obesity using bioinformatics analysis of GWAS data

    OpenAIRE

    Zhuang, Qi-Shuai; Zheng, Hao; Gu, Xiao-Dan; Shen, Liang; Ji, Hong-Fang

    2017-01-01

    Alzheimer's disease (AD) represents the major form of dementia in the elderly. In recent years, accumulating evidence indicates that obesity may act as a risk factor for AD, while the genetic link between the two conditions remains unclear. This bioinformatics analysis aimed to detect the genetic link between AD and obesity on single nucleotide polymorphisms (SNPs), gene, and pathway levels based on genome-wide association studies data. A total of 31 SNPs were found to be shared by AD and obes...

  9. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer

    2016-01-01

    within convenient, integrated “workbench” environments. Resource descriptions are the core element of registry and workbench systems, which are used to both help the user find and comprehend available software tools, data resources, and Web Services, and to localise, execute and combine them......, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry....

  10. Interinstitutional collaboration for end-user bioinformatics training: Cytoscape as a case study

    Directory of Open Access Journals (Sweden)

    Marci D. Brandenburg, MS, MSI

    2017-04-01

    Conclusions: This collaboration furthered the U-M bioinformationist’s role in the field as an expert in Cytoscape instruction, while also establishing the CWML as a leader in providing support for analyzing and visualizing molecular data at Yale University. The authors found this collaboration to be a successful way for librarians to fill end-user training gaps in rapidly changing fields such as bioinformatics.

  11. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols.

    Science.gov (United States)

    Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A

    2015-01-01

    Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the Python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy-to-learn wiki syntax effectively lowers the barriers to bringing expert programmers and less computer-savvy researchers onto the same page. PyPedia demonstrates how a wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complements existing resources and is useful for local and multi-center research teams. PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia Python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.

  12. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    Science.gov (United States)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of chosen motifs has been described.

  13. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We use Fgenesh, GenScan and GeneMark for automated structural annotation; the results have been compared to the reference annotation.
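
    The abstract does not say how predictions were compared with the reference annotation. One common convention, shown here purely as an assumed illustration, is exon-level accuracy with exact coordinate matching: sensitivity is the fraction of reference exons recovered and precision is the fraction of predicted exons that are correct. The function name and the coordinates below are hypothetical.

```python
# A minimal sketch under stated assumptions; not the comparison used by the authors.
from typing import Iterable, Tuple

Exon = Tuple[int, int]  # (start, end) on a single sequence and strand


def exon_level_accuracy(predicted: Iterable[Exon], reference: Iterable[Exon]):
    """Return (sensitivity, precision) using exact start/end matches only."""
    pred, ref = set(predicted), set(reference)
    true_pos = len(pred & ref)  # exons predicted with both boundaries correct
    sensitivity = true_pos / len(ref) if ref else 0.0
    precision = true_pos / len(pred) if pred else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    reference = [(100, 200), (300, 400), (500, 650)]                 # made-up coordinates
    predicted = [(100, 200), (300, 420), (500, 650), (700, 800)]     # made-up coordinates
    sens, prec = exon_level_accuracy(predicted, reference)
    print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

    Exact matching is strict: the exon predicted as (300, 420) counts as wrong even though it overlaps the reference exon. Nucleotide-level or gene-level measures would give different numbers for the same predictions.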

  14. FASTdoop: a versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications.

    Science.gov (United States)

    Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele

    2017-05-15

    MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need for efficiently managing sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers versatility and efficiency. That is, it can handle collections of reads, with or without quality scores, as well as long genomic sequences, while the existing routines concentrate mainly on NGS sequence data. Moreover, in the domain where a comparison is possible, the routines proposed here are faster than the available ones. In conclusion, FASTdoop is a much needed addition to Hadoop-BAM. The software and the datasets are available at http://www.di.unisa.it/FASTdoop/ . Contact: umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
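
    To make the two file formats concrete, here is a small single-machine Python sketch of FASTA and FASTQ record readers. It is not FASTdoop's code or API (FASTdoop is a Hadoop library, and nothing here touches Hadoop or the hard part, splitting files across workers); the function names are hypothetical, and the FASTQ reader assumes the common four-lines-per-record layout with unwrapped sequences.

```python
# A minimal sketch of the file formats only; function names are assumptions.
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence); sequences may be wrapped over several lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality); stops at EOF or the first blank line."""
    while True:
        header = handle.readline().strip()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # the '+' separator line is read and ignored
        qual = handle.readline().strip()
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, qual


if __name__ == "__main__":
    fasta = io.StringIO(">seq1 demo\nACGT\nTTAA\n>seq2\nGG\n")
    fastq = io.StringIO("@r1\nACGT\n+\nIIII\n")
    print(list(read_fasta(fasta)))
    print(list(read_fastq(fastq)))
```

    The difficulty the abstract points to arises when a file is cut into arbitrary byte ranges: a split can begin in the middle of a record, and in FASTQ an "@" can also appear as a quality character, so record boundaries cannot be found by looking at a single line.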

  15. Food Safety in the Age of Next Generation Sequencing, Bioinformatics, and Open Data Access.

    Science.gov (United States)

    Taboada, Eduardo N; Graham, Morag R; Carriço, João A; Van Domselaar, Gary

    2017-01-01

    Public health labs and food regulatory agencies globally are embracing whole genome sequencing (WGS) as a revolutionary new method that is positioned to replace numerous existing diagnostic and microbial typing technologies with a single new target: the microbial draft genome. The ability to cheaply generate large amounts of microbial genome sequence data, combined with emerging policies of food regulatory and public health institutions making their microbial sequences increasingly available and public, has served to open up the field to the general scientific community. This open data access policy shift has resulted in a proliferation of data being deposited into sequence repositories and of novel bioinformatics software designed to analyze these vast datasets. There also has been a more recent drive for improved data sharing to achieve more effective global surveillance, public health and food safety. Such developments have heightened the need for enhanced analytical systems in order to process and interpret this new type of data in a timely fashion. In this review we outline the emergence of genomics, bioinformatics and open data in the context of food safety. We also survey major efforts to translate genomics and bioinformatics technologies out of the research lab and into routine use in modern food safety labs. We conclude by discussing the challenges and opportunities that remain, including those expected to play a major role in the future of food safety science.

  16. Host-parasite interactions and ecology of the malaria parasite-a bioinformatics approach.

    Science.gov (United States)

    Izak, Dariusz; Klim, Joanna; Kaczanowski, Szymon

    2018-04-25

    Malaria remains one of the highest mortality infectious diseases. Malaria is caused by parasites from the genus Plasmodium. Most deaths are caused by infections involving Plasmodium falciparum, which has a complex life cycle. Malaria parasites are extremely well adapted for interactions with their host and their host's immune system and are able to suppress the human immune system, erase immunological memory and rapidly alter exposed antigens. Owing to this rapid evolution, parasites develop drug resistance and express novel forms of antigenic proteins that are not recognized by the host immune system. There is an emerging need for novel interventions, including novel drugs and vaccines. Designing novel therapies requires knowledge about host-parasite interactions, which is still limited. However, significant progress has recently been achieved in this field through the application of bioinformatics analysis of parasite genome sequences. In this review, we describe the main achievements in 'malarial' bioinformatics and provide examples of successful applications of protein sequence analysis. These examples include the prediction of protein functions based on homology and the prediction of protein surface localization via domain and motif analysis. Additionally, we describe PlasmoDB, a database that stores accumulated experimental data. This tool allows data mining of the stored information and will play an important role in the development of malaria science. Finally, we illustrate the application of bioinformatics in the development of population genetics research on malaria parasites, an approach referred to as reverse ecology.

  17. A Review of Recent Advances in Translational Bioinformatics: Bridges from Biology to Medicine.

    Science.gov (United States)

    Vamathevan, J; Birney, E

    2017-08-01

    Objectives: To highlight and provide insights into key developments in translational bioinformatics between 2014 and 2016. Methods: This review describes some of the most influential bioinformatics papers and resources that have been published between 2014 and 2016 as well as the national genome sequencing initiatives that utilize these resources to routinely embed genomic medicine into healthcare. Also discussed are some applications of the secondary use of patient data followed by a comprehensive view of the open challenges and emergent technologies. Results: Although data generation can be performed routinely, analyses and data integration methods still require active research and standardization to improve streamlining of clinical interpretation. The secondary use of patient data has resulted in the development of novel algorithms and has enabled a refined understanding of cellular and phenotypic mechanisms. New data storage and data sharing approaches are required to enable diverse biomedical communities to contribute to genomic discovery. Conclusion: The translation of genomics data into actionable knowledge for use in healthcare is transforming the clinical landscape in an unprecedented way. Exciting and innovative models that bridge the gap between clinical and academic research are set to open up the field of translational bioinformatics for rapid growth in a digital era. Georg Thieme Verlag KG Stuttgart.

  18. A selective review of robust variable selection with applications in bioinformatics.

    Science.gov (United States)

    Wu, Cen; Ma, Shuangge

    2015-09-01

    A drastic amount of data have been and are being generated in bioinformatics studies. In the analysis of such data, the standard modeling approaches can be challenged by the heavy-tailed errors and outliers in response variables, the contamination in predictors (which may be caused by, for instance, technical problems in microarray gene expression studies), model mis-specification and others. Robust methods are needed to tackle these challenges. When there are a large number of predictors, variable selection can be as important as estimation. As a generic variable selection and regularization tool, penalization has been extensively adopted. In this article, we provide a selective review of robust penalized variable selection approaches especially designed for high-dimensional data from bioinformatics and biomedical studies. We discuss the robust loss functions, penalty functions and computational algorithms. The theoretical properties and implementation are also briefly examined. Application examples of the robust penalization approaches in representative bioinformatics and biomedical studies are also illustrated. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
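
    As one concrete pairing of a robust loss with a penalty, the Huber loss can be combined with the lasso penalty. The notation below (responses y_i, predictor vectors x_i, coefficients β, tuning parameters λ and δ) is introduced here for illustration and is not taken from the abstract; other losses and penalties fit the same template.

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}} \;
  \frac{1}{n} \sum_{i=1}^{n} \rho_{\delta}\!\left(y_i - x_i^{\top}\beta\right)
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad
\rho_{\delta}(r) \;=\;
\begin{cases}
  \tfrac{1}{2}\, r^{2}, & \lvert r \rvert \le \delta, \\[4pt]
  \delta \lvert r \rvert - \tfrac{1}{2}\,\delta^{2}, & \lvert r \rvert > \delta .
\end{cases}
```

    The loss is quadratic for small residuals and grows only linearly for large ones, which limits the influence of outliers in the response, while the absolute-value penalty shrinks some coefficients exactly to zero and so performs variable selection.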

  19. Strategies for using peer-assisted learning effectively in an undergraduate bioinformatics course.

    Science.gov (United States)

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA. Findings indicate that a critical feature of the peer-assisted learning approach is to have near-peer leaders with genome annotation experience, allowing them to communicate technical and conceptual aspects of the process in the context of a research project (a.k.a., the "big picture"). These characteristics are important for creating connections between the wet lab experiments and the computer lab activities, engendering excitement about the research project and fostering engagement in bioinformatics as a discipline. Likewise, it is essential to couple tutorial training in genome annotation with appropriate instructional materials, providing detailed, step-by-step instructions for database navigation. Finally, the assessment results support this hybrid peer-assisted learning approach as a model for undergraduates to successfully learn bioinformatics in a course setting. Copyright © 2012 International Union of Biochemistry and Molecular Biology, Inc.

  20. The Web as an educational tool for/in learning/teaching bioinformatics statistics.

    Science.gov (United States)

    Oliver, J; Pisano, M E; Alonso, T; Roca, P

    2005-12-01

    Statistics provides essential tools in bioinformatics for interpreting the results of a database search and for managing the enormous amounts of information provided by genomics, proteomics and metabolomics. The goal of this project was to develop a software tool, as simple as possible, to demonstrate the use of bioinformatics statistics. Computer Simulation Methods (CSMs) developed using Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing and easy graphical representation, and their general use and acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When tested on students with previous coursework in traditional statistical teaching methods, the overall consensus was that Web-based instruction had numerous advantages, but traditional methods with manual calculations were also needed for their theory and practice. Once the basic statistical formulas had been mastered, Excel spreadsheets and graphics were shown to be very useful for trying many parameters rapidly without having to perform tedious calculations. CSMs will be of great importance for the training of students and professionals in the field of bioinformatics, and for upcoming applications of self-learning and continuing education.

  1. Food Safety in the Age of Next Generation Sequencing, Bioinformatics, and Open Data Access

    Directory of Open Access Journals (Sweden)

    Eduardo N. Taboada

    2017-05-01

    Public health labs and food regulatory agencies globally are embracing whole genome sequencing (WGS) as a revolutionary new method that is positioned to replace numerous existing diagnostic and microbial typing technologies with a single new target: the microbial draft genome. The ability to cheaply generate large amounts of microbial genome sequence data, combined with emerging policies of food regulatory and public health institutions making their microbial sequences increasingly available and public, has served to open up the field to the general scientific community. This open data access policy shift has resulted in a proliferation of data being deposited into sequence repositories and of novel bioinformatics software designed to analyze these vast datasets. There also has been a more recent drive for improved data sharing to achieve more effective global surveillance, public health and food safety. Such developments have heightened the need for enhanced analytical systems in order to process and interpret this new type of data in a timely fashion. In this review we outline the emergence of genomics, bioinformatics and open data in the context of food safety. We also survey major efforts to translate genomics and bioinformatics technologies out of the research lab and into routine use in modern food safety labs. We conclude by discussing the challenges and opportunities that remain, including those expected to play a major role in the future of food safety science.

  2. Tissue Banking, Bioinformatics, and Electronic Medical Records: The Front-End Requirements for Personalized Medicine

    Science.gov (United States)

    Suh, K. Stephen; Sarojini, Sreeja; Youssif, Maher; Nalley, Kip; Milinovikj, Natasha; Elloumi, Fathi; Russell, Steven; Pecora, Andrew; Schecter, Elyssa; Goy, Andre

    2013-01-01

    Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs). The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publicly available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMRs in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine. PMID:23818899

  3. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  4. Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.

    Science.gov (United States)

    Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W

    2018-01-01

    Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential. Thus, it motivated us to perform this study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. dtww@ucla.edu. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  5. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  6. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  7. Experimental tests of FTE theories. [Flux Transfer Event

    Science.gov (United States)

    Sonnerup, B. U. O.

    1988-01-01

    A brief overview is presented of a number of qualitative geometrical models as well as simulation models of flux transfer events (FTEs) and an attempt is made to identify critical observational tests that would help discriminate between these models. These tests are concerned with flux tube orientation, speed and structure. Available single and multiple spacecraft observations are discussed in the context of these tests. Particular attention is given to the problem of determining flux-tube orientation and speed from a single spacecraft.

  8. Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods

    Czech Academy of Sciences Publication Activity Database

    Martin, G.; Baurens, F.C.; Droc, G.; Rouard, M.; Cenci, A.; Kilian, A.; Hastie, A.; Doležel, Jaroslav; Aury, J. M.; Alberti, A.; Carreel, F.; D'Hont, A.

    2016-01-01

    Vol. 17, MAR 16 (2016), p. 243 ISSN 1471-2164 Institutional support: RVO:61389030 Keywords: Musa acuminata * Genome assembly * Bioinformatics tool Subject RIV: EB - Genetics; Molecular Biology Impact factor: 3.729, year: 2016

  9. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need...... access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools...... into their research. We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  10. MACBenAbim: A Multi-platform Mobile Application for searching keyterms in Computational Biology and Bioinformatics.

    Science.gov (United States)

    Oluwagbemi, Olugbenga O; Adewumi, Adewole; Esuruoso, Abimbola

    2012-01-01

    Computational biology and bioinformatics are gradually gaining ground in Africa and other developing nations of the world. However, in these countries, some of the challenges of computational biology and bioinformatics education are inadequate infrastructure and a lack of readily available complementary and motivational tools to support learning as well as research. This has discouraged many promising undergraduates, postgraduates and researchers from aspiring to undertake future study in these fields. In this paper, we developed and described MACBenAbim (Multi-platform Mobile Application for Computational Biology and Bioinformatics), a flexible user-friendly tool to search for, define and describe the meanings of keyterms in computational biology and bioinformatics, thus expanding the frontiers of knowledge of the users. This tool is also capable of visualizing results in a mobile multi-platform context. MACBenAbim is available from the authors for non-commercial purposes.

  11. [From bioinformatics to systems biology: account of the 12th international conference on intelligent systems in molecular biology].

    Science.gov (United States)

    Ivakhno, S S

    2004-01-01

    The paper reviews the 12th International Conference on Intelligent Systems for Molecular Biology/Third European Conference on Computational Biology 2004 that was held in Glasgow, UK, during July 31-August 4. A number of talks, papers and software demos from the conference in bioinformatics, genomics, proteomics, transcriptomics and systems biology are described. Recent applications of liquid chromatography - tandem mass spectrometry, comparative genomics and DNA microarrays are given, along with a discussion of bioinformatics curricula in higher education.

  12. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  13. Making authentic science accessible—the benefits and challenges of integrating bioinformatics into a high-school science curriculum

    Science.gov (United States)

    Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat

    2017-01-01

    Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled ‘Bioinformatics in the Service of Biotechnology’. Students’ learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students’ difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students’ cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students’ scientific ‘toolbox’. For students, questions stemming from the ‘old world’ biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers’ prediction. Analysis of students’ affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher’s role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. PMID:26801769

  14. Designing a course model for distance-based online bioinformatics training in Africa: The H3ABioNet experience.

    Directory of Open Access Journals (Sweden)

    Kim T Gurwitz

    2017-10-01

    Full Text Available Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge "Introduction to Bioinformatics" course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery-mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online "question and discussion" forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings.

  15. Designing a course model for distance-based online bioinformatics training in Africa: The H3ABioNet experience.

    Science.gov (United States)

    Gurwitz, Kim T; Aron, Shaun; Panji, Sumir; Maslamoney, Suresh; Fernandes, Pedro L; Judge, David P; Ghouila, Amel; Domelevo Entfellner, Jean-Baka; Guerfali, Fatma Z; Saunders, Colleen; Mansour Alzohairy, Ahmed; Salifu, Samson P; Ahmed, Rehab; Cloete, Ruben; Kayondo, Jonathan; Ssemwanga, Deogratius; Mulder, Nicola

    2017-10-01

    Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge "Introduction to Bioinformatics" course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery-mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online "question and discussion" forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings.

  16. Bioinformatics as a pedagogical resource for the biology course in the State University of Ceara - UECE - Fortaleza, Ceará State - doi: 10.4025/actascieduc.v34i1.14584

    Directory of Open Access Journals (Sweden)

    Howard Lopes Ribeiro Junior

    2012-05-01

    Full Text Available The objective of this study was to evaluate and apply theoretical and practical Bioinformatics content for students of the Full Degree course in Biological Sciences enrolled in the General Genetics and Molecular Biology disciplines at the State University of Ceara in 2010. The theoretical approach, previously tested (RIBEIRO JUNIOR, 2011), consisted of a presentation of historical, basic and specific concepts relating to current advances in research in the areas of molecular biology. The practical activity, "Building a Molecular Phylogeny in Silico", was designed to put the concepts presented above into practice, using the database of the National Center for Biotechnology Information (NCBI) and its sequence alignment tool, BLASTp (Basic Local Alignment Search Tool Protein-Protein). Positive results obtained with the application of the lecture Introduction to Bioinformatics and the practical activities were highlighted by the characterizations of molecular phylogenies of the hypothetical sequences proposed for the alignments and by the statements of the students mentioned above. These activities were seen as essential so that students could experience, step by step, a better understanding of the emerging field of life sciences: Bioinformatics

  17. CROSSWORK for Glycans: Glycan Identification Through Mass Spectrometry and Bioinformatics. / Rasmussen, Morten ; Thaysen-Andersen, Morten ; Højrup, Peter. 2009

    DEFF Research Database (Denmark)

    Rasmussen, Morten

    automates the identification of both the glycopeptides and their N-linked glycosylation(s) from standard MS2 scans of glycoproteins. We have tested the efficiency of GLYCANthrope by searching MS2 data (CID mode) from 6 tryptically digested glycoproteins with a total of 11 known N-linked glycosylation sites......Chemical cross-linking is a useful method for deriving information on protein structure and protein-protein interaction. We have developed a method combining chemical cross-linking with mass-spectrometry and bioinformatics (CrossWork) to automate search and validation of cross-links in large......-scale experiments. Glycoproteins, however, have posed a special challenge to the method, since the glycan moiety of any glycosylated residue tends to be heterogeneous within the same sample, which vastly complicates the search for cross-links. Here we present a new software application, GLYCANthrope, which...

  18. G-DOC Plus - an integrative bioinformatics platform for precision medicine.

    Science.gov (United States)

    Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha

    2016-04-30

    G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. 
G-DOC Plus is available

  19. Bioinformatics for Precision Medicine in Oncology: principles and application to the SHIVA clinical trial

    Directory of Open Access Journals (Sweden)

    Nicolas Servant

    2014-05-01

    Full Text Available Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing, which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address these issues, a seamless information system was developed at Institut Curie which facilitates data integration and tracks the processing of individual samples in real time. Moreover, computational pipelines were developed to reliably identify genomic alterations and mutations from the molecular profiles of each patient. After rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application.

  20. BioXSD: the common data-exchange format for everyday bioinformatics web services.

    Science.gov (United States)

    Kalas, Matús; Puntervoll, Pål; Joseph, Alexandre; Bartaseviciūte, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-09-15

    The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.

  1. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human wealth. However, bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with a well-known taxon containing the active compound, through published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interaction of the target protein with other proteins in the cell was examined to determine the mechanism of action of the active compounds at the cellular level, as well as to predict side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory compounds from Indonesian biodiversity. For example, we found an anti-cancer compound from a marine invertebrate by employing the method. The anti-cancer compound was explored based on the isolated compounds of the marine invertebrate from published articles and databases; the protein target was then identified, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells.
Therefore, we concluded that bioinformatics is cheap and robust way to explore bioactive from Indonesia biodiversity for source of drug and another
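    The abstract does not say which drug-likeness criterion was applied. One widely used heuristic is Lipinski's rule of five, and the sketch below shows how such a check could look in Python. It assumes the molecular descriptors have already been computed by some other tool; the `Descriptors` class, the function names and the two example compounds are hypothetical and carry no data from the study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (assumed to come from another tool)."""
    mol_weight: float   # molecular weight in daltons
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Treat a compound as drug-like if it has at most `max_violations`."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only.
small_compound = Descriptors(mol_weight=350.0, log_p=2.1, h_donors=2, h_acceptors=5)
large_compound = Descriptors(mol_weight=820.0, log_p=6.3, h_donors=7, h_acceptors=14)
```

    Allowing one violation follows the common reading of the rule; a stricter screen could pass `max_violations=0`.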

  2. BioXSD: the common data-exchange format for everyday bioinformatics web services

    Science.gov (United States)

    Kalaš, Matúš; Puntervoll, Pål; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org PMID:20823319

  3. Quantum Bio-Informatics: From Quantum Information to Bio-Informatics

    CERN Document Server

    Freudenberg, W; Ohya, M

    2008-01-01

    The purpose of this volume is to examine bio-informatics and quantum information, which are growing rapidly at present, and to attempt to connect the two, with a view to enumerating and solving the many fundamental problems they entail. To this end, we look for interdisciplinary bridges in mathematics, physics, and information and life sciences. In particular, research into a new paradigm for information science and life science on the basis of quantum theory is emphasized. Sample Chapter(s). Markov Fields on Graphs (599 KB). Contents: Markov Fields on Graphs (L Accardi & H Ohno); Some Aspects of

  4. Bioinformatic identification and characterization of human endothelial cell-restricted genes

    Directory of Open Access Journals (Sweden)

    Keskin Derin B

    2010-05-01

    Full Text Available Abstract Background In this study, we used a systematic bioinformatics analysis approach to elucidate genes that exhibit an endothelial cell (EC) restricted expression pattern, and began to define their regulation, tissue distribution, and potential biological role. Results Using a high throughput microarray platform, a primary set of 1,191 transcripts that are enriched in different primary ECs compared to non-ECs was identified (LCB >3, FDR Conclusion The study provides an initial catalogue of EC-restricted genes most of which are ubiquitously expressed in different endothelial cells.

  5. ClusterControl: a web interface for distributing and monitoring bioinformatics applications on a Linux cluster.

    Science.gov (United States)

    Stocker, Gernot; Rieder, Dietmar; Trajanoski, Zlatko

    2004-03-22

    ClusterControl is a web interface to simplify distributing and monitoring bioinformatics applications on Linux cluster systems. We have developed a modular concept that enables integration of command-line-oriented programs into the application framework of ClusterControl. The system facilitates integration of different applications accessed through one interface and executed on a distributed cluster system. The package is based on freely available technologies like Apache as web server, PHP as server-side scripting language and OpenPBS as queuing system and is available free of charge for academic and non-profit institutions. http://genome.tugraz.at/Software/ClusterControl

  6. bioalcidae, samjs and vcffilterjs: object-oriented formatters and filters for bioinformatics files.

    Science.gov (United States)

    Lindenbaum, Pierre; Redon, Richard

    2018-04-01

    Reformatting and filtering bioinformatics files are common tasks for bioinformaticians. Standard Linux tools and specific programs are usually used to perform such tasks, but there is still a gap between using these tools and the programming interface of some existing libraries. In this study, we developed a set of tools, namely bioalcidae, samjs and vcffilterjs, that reformat or filter files using a JavaScript engine or a pure Java expression and take advantage of the Java API for high-throughput sequencing data (htsjdk). https://github.com/lindenb/jvarkit. pierre.lindenbaum@univ-nantes.fr.

  7. Swarm intelligence in bioinformatics: methods and implementations for discovering patterns of multiple sequences.

    Science.gov (United States)

    Cui, Zhihua; Zhang, Yi

    2014-02-01

    As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Among the enormous number of open problems in this field, one fundamental issue is the need for accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discover patterns of multiple sequences. To provide a deep insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithms are selected, and their applications to multiple sequence alignment and motif detection problems are discussed.

  8. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs...

  9. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

    Directory of Open Access Journals (Sweden)

    Athanassios M. Kintsakis

    2017-01-01

    Full Text Available Hermes introduces a new “describe once, run anywhere” paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  10. Proceedings of the 2013 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference.

    Science.gov (United States)

    Wren, Jonathan D; Dozmorov, Mikhail G; Burian, Dennis; Kaundal, Rakesh; Perkins, Andy; Perkins, Ed; Kupfer, Doris M; Springer, Gordon K

    2013-01-01

    The tenth annual conference of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS 2013), "The 10th Anniversary in a Decade of Change: Discovery in a Sea of Data", took place at the Stoney Creek Inn & Conference Center in Columbia, Missouri on April 5-6, 2013. This year's Conference Chairs were Gordon Springer and Chi-Ren Shyu from the University of Missouri and Edward Perkins from the US Army Corps of Engineers Engineering Research and Development Center, who is also the current MCBIOS President (2012-3). There were 151 registrants and a total of 111 abstracts (51 oral presentations and 60 poster session abstracts).

  11. An interdepartmental Ph.D. program in computational biology and bioinformatics: the Yale perspective.

    Science.gov (United States)

    Gerstein, Mark; Greenbaum, Dov; Cheung, Kei; Miller, Perry L

    2007-02-01

    Computational biology and bioinformatics (CBB), the terms often used interchangeably, represent a rapidly evolving biological discipline. With the clear potential for discovery and innovation, and the need to deal with the deluge of biological data, many academic institutions are committing significant resources to develop CBB research and training programs. Yale formally established an interdepartmental Ph.D. program in CBB in May 2003. This paper describes Yale's program, discussing the scope of the field, the program's goals and curriculum, as well as a number of issues that arose in implementing the program. (Further updated information is available from the program's website, www.cbb.yale.edu.)

  12. Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments

    Science.gov (United States)

    Kintsakis, Athanassios M.; Psomopoulos, Fotis E.; Symeonidis, Andreas L.; Mitkas, Pericles A.

    Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.

  13. Designing a course model for distance-based online bioinformatics training in Africa: The H3ABioNet experience

    Science.gov (United States)

    Panji, Sumir; Fernandes, Pedro L.; Judge, David P.; Ghouila, Amel; Salifu, Samson P.; Ahmed, Rehab; Kayondo, Jonathan; Ssemwanga, Deogratius

    2017-01-01

    Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge “Introduction to Bioinformatics” course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery–mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online “question and discussion” forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings. PMID:28981516

  14. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given to how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  15. OralCard: a bioinformatic tool for the study of oral proteome.

    Science.gov (United States)

    Arrais, Joel P; Rosa, Nuno; Melo, José; Coelho, Edgar D; Amaral, Diana; Correia, Maria José; Barros, Marlene; Oliveira, José Luís

    2013-07-01

    The molecular complexity of the human oral cavity can only be clarified through identification of components that participate within it. However, current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome both of human (OralOme) and microbial origin (MicroOralOme) structured in protein, disease and organism. This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with diabetes mellitus type 2. OralCard is available at http://bioinformatics.ua.pt/oralcard. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. XCluSim: a visual analytics tool for interactively comparing multiple clustering results of bioinformatics data

    Science.gov (United States)

    2015-01-01

    Background Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious. Results In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician. Conclusions Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results. PMID:26328893

  17. Documenting the emergence of bio-ontologies: or, why researching bioinformatics requires HPSSB.

    Science.gov (United States)

    Leonelli, Sabina

    2010-01-01

    This paper reflects on the analytic challenges emerging from the study of bioinformatic tools recently created to store and disseminate biological data, such as databases, repositories, and bio-ontologies. I focus my discussion on the Gene Ontology, a term that defines three entities at once: a classification system facilitating the distribution and use of genomic data as evidence towards new insights; an expert community specialised in the curation of those data; and a scientific institution promoting the use of this tool among experimental biologists. These three dimensions of the Gene Ontology can be clearly distinguished analytically, but are tightly intertwined in practice. I suggest that this is true of all bioinformatic tools: they need to be understood simultaneously as epistemic, social, and institutional entities, since they shape the knowledge extracted from data and at the same time regulate the organisation, development, and communication of research. This viewpoint has one important implication for the methodologies used to study these tools; that is, the need to integrate historical, philosophical, and sociological approaches. I illustrate this claim through examples of misunderstandings that may result from a narrowly disciplinary study of the Gene Ontology, as I experienced them in my own research.

  18. The secondary metabolite bioinformatics portal: Computational tools to facilitate synthetic biology of secondary metabolite production

    Directory of Open Access Journals (Sweden)

    Tilmann Weber

    2016-06-01

    Full Text Available Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP) at http://www.secondarymetabolites.org is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented on how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field.

  19. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  20. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  1. BioGPS descriptors for rational engineering of enzyme promiscuity and structure based bioinformatic analysis.

    Directory of Open Access Journals (Sweden)

    Valerio Ferrario

    Full Text Available A new bioinformatic methodology was developed founded on the Unsupervised Pattern Cognition Analysis of GRID-based BioGPS descriptors (Global Positioning System in Biological Space). The procedure relies entirely on three-dimensional structure analysis of enzymes and does not stem from sequence or structure alignment. The BioGPS descriptors account for chemical, geometrical and physical-chemical features of enzymes and are able to describe comprehensively the active site of enzymes in terms of "pre-organized environment" able to stabilize the transition state of a given reaction. The efficiency of this new bioinformatic strategy was demonstrated by the consistent clustering of four different Ser hydrolases classes, which are characterized by the same active site organization but able to catalyze different reactions. The method was validated by considering, as a case study, the engineering of amidase activity into the scaffold of a lipase. The BioGPS tool predicted correctly the properties of lipase variants, as demonstrated by the projection of mutants inside the BioGPS "roadmap".

  2. Tools and data services registry: a community effort to document bioinformatics resources

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  3. Ergatis: a web interface and scalable software system for bioinformatics workflows

    Science.gov (United States)

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  4. LipSpin: A New Bioinformatics Tool for Quantitative 1H NMR Lipid Profiling.

    Science.gov (United States)

    Barrilero, Rubén; Gil, Miriam; Amigó, Núria; Dias, Cintia B; Wood, Lisa G; Garg, Manohar L; Ribalta, Josep; Heras, Mercedes; Vinaixa, Maria; Correig, Xavier

    2018-02-06

    The structural similarity among lipid species and the low sensitivity and spectral resolution of nuclear magnetic resonance (NMR) have traditionally hampered the routine use of 1H NMR lipid profiling of complex biological samples in metabolomics, which remains mostly manual and lacks freely available bioinformatics tools. However, 1H NMR lipid profiling provides fast quantitative screening of major lipid classes (fatty acids, glycerolipids, phospholipids, and sterols) and some individual species and has been used in several clinical and nutritional studies, leading to improved risk prediction models. In this Article, we present LipSpin, a free and open-source bioinformatics tool for quantitative 1H NMR lipid profiling. LipSpin implements a constrained line shape fitting algorithm based on Voigt profiles and spectral templates from spectra of lipid standards, which automates the analysis of severely overlapped spectral regions and lipid signals with complex coupling patterns. LipSpin provides the most detailed quantification of fatty acid families and choline phospholipids in serum lipid samples by 1H NMR to date. Moreover, analytical and clinical results using LipSpin quantifications conform with other techniques commonly used for lipid analysis.

  5. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.
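
    The idea of compiling unique sequences and then scoring them by abundance and persistence can be sketched outside Galaxy in a few lines of Python. This is only an illustration, not the authors' Workflows: the function names are hypothetical, and the definitions used here (abundance as total read count across rounds, persistence as the number of selection rounds in which a sequence appears) are assumptions made for the example.

```python
from collections import Counter


def summarize_rounds(rounds):
    """Collapse reads into unique sequences and score each one.

    `rounds` is a list with one list of read sequences per selection round.
    Assumed definitions (not taken from the abstract):
      abundance   = total number of reads of the sequence over all rounds
      persistence = number of rounds in which the sequence was seen
    """
    summary = {}
    for reads in rounds:
        # Count identical reads within this round (case-insensitive).
        counts = Counter(seq.upper() for seq in reads)
        for seq, n in counts.items():
            entry = summary.setdefault(seq, {"abundance": 0, "persistence": 0})
            entry["abundance"] += n
            entry["persistence"] += 1
    return summary


def filter_sequences(summary, min_abundance=1, min_persistence=1):
    """Return the sorted sequences that meet both thresholds."""
    return sorted(
        seq
        for seq, stats in summary.items()
        if stats["abundance"] >= min_abundance
        and stats["persistence"] >= min_persistence
    )


# Toy data: three selection rounds with made-up reads.
example_rounds = [
    ["ACGT", "ACGT", "GGCC"],
    ["acgt", "TTAA"],
    ["ACGT", "GGCC"],
]
example_summary = summarize_rounds(example_rounds)
persistent = filter_sequences(example_summary, min_persistence=2)
```

    With real data the per-round lists would come from parsed FASTQ or FASTA files, and the thresholds would be chosen per selection.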

  6. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    2011-01-01

    Full Text Available Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs) which captured all common variation in these genes were genotyped in 10,196 Danes, and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D), central obesity, and WHO-defined metabolic syndrome (MetS). 273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominally statistically significant associations (P<0.05) to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.
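
    To see why nominally significant results can vanish after a Bonferroni adjustment, a minimal Python sketch follows. The p-values are invented for illustration and the number of tests is arbitrary; neither is taken from the study, and the function name is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: multiply each p-value by the number of tests.

    Returns a list of (raw_p, adjusted_p, significant) tuples, where the
    adjusted p-value is capped at 1.0 and compared against `alpha`.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((p, adjusted, adjusted < alpha))
    return results


# Hypothetical p-values: three are nominally significant (p < 0.05),
# but only one survives correction for four tests.
example = bonferroni([0.01, 0.04, 0.0004, 0.2])
surviving = [raw for raw, adjusted, ok in example if ok]
```

    The same adjustment can be stated as comparing each raw p-value to alpha divided by the number of tests; the more tests performed, the stricter the per-test threshold.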

  7. [Bioinformatic analysis of adenoma-normal mucosa SSH library of colon].

    Science.gov (United States)

    Lü, Bing-Jian; Cui, Jing; Xu, Jing; Zhang, Hao; Luo, Min-Jie; Zhu, Yi-Min; Lai, Mao-De

    2006-04-01

    We established a colonic adenoma-normal mucosa suppressive subtraction hybridization (SSH) library in 1999. In this study, we wanted to explore the expression profile of all candidate genes in this library. We developed an EST pipeline which contained two in-house software packages, nucleic acid analytical software and GetUni. The nucleic acid analytical software, an integrator of the universal bioinformatics tools including phred, phd2fasta, cross_match, repeatmasker and blast2.0, can blast sequences of differential clones with the downloaded non-redundant nucleotide (NR) database. GetUni can cluster these NR sequences into Unigene via matching with the downloaded Homo sapiens UniGene database. Sixty-two candidate genes in A-N library were obtained via the high throughput automatic gene expression bioinformatics pipeline. Gene Ontology online analysis revealed that ribosome genes and immunity-regulating genes were the two most common categories in the KEGG or Biocarta Pathway. We also detected the expression of the 2 genes with the highest hits, Reg4 and FAM46A, by semi-quantitative RT-PCR. Both genes were up-regulated in 10 or 9 out of 10 adenomas in comparison with the paired normal mucosa, respectively. The candidate genes in A-N library would be of great significance in disclosing the molecular mechanism underlying colonic adenoma initiation and progression.

  8. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    Science.gov (United States)

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.

  9. Novel bioinformatics strategies for prediction of directional sequence changes in influenza virus genomes and for surveillance of potentially hazardous strains.

    Science.gov (United States)

    Iwasaki, Yuki; Abe, Takashi; Wada, Yoshiko; Wada, Kennosuke; Ikemura, Toshimichi

    2013-08-21

    With the remarkable increase of microbial and viral sequence data obtained from high-throughput DNA sequencers, novel tools are needed for comprehensive analysis of big sequence data. We have developed the "Batch-Learning Self-Organizing Map (BLSOM)", which can characterize very many, even millions of, genomic sequences on one plane. Influenza virus is a zoonotic virus and shows clear host tropism. Important issues for bioinformatics studies of influenza viruses are prediction of genomic sequence changes in the near future and surveillance of potentially hazardous strains. To characterize sequence changes in influenza virus genomes after invasion into humans from other animal hosts, we applied BLSOMs to analyses of mono-, di-, tri-, and tetranucleotide compositions in all genome sequences of influenza A and B viruses and found clear host-dependent clustering (self-organization) of the sequences. Viruses isolated from humans and birds differed from each other in mononucleotide composition. In addition, host-dependent oligonucleotide compositions that could not be explained by the host-dependent mononucleotide composition were revealed by oligonucleotide BLSOMs. Retrospective time-dependent directional changes of mono- and oligonucleotide compositions, which were visualized for human strains on BLSOMs, could provide predictive information about sequence changes in newly invaded viruses from other animal hosts (e.g. the swine-derived pandemic H1N1/09). Based on the host-dependent oligonucleotide composition, we proposed a strategy for prediction of directional changes of virus sequences and for surveillance of potentially hazardous strains when introduced into human populations from non-human sources. Millions of genomic sequences from infectious microbes and viruses have become available because of their medical and social importance, and BLSOM can characterize the big data and support efficient knowledge discovery.
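
    The mono-, di-, tri-, and tetranucleotide compositions used as input in this abstract can be sketched as follows. This is only an illustration of computing k-mer frequency vectors, not the BLSOM method itself; the function name `kmer_frequencies` and the example sequence are hypothetical, and sequences are assumed to be written in the DNA alphabet (A, C, G, T), with windows containing any other character skipped.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k):
    """Return the relative frequency of every possible k-mer over ACGT.

    Overlapping windows are counted; windows containing characters other
    than A, C, G, T (for example N) are ignored.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


if __name__ == "__main__":
    # Hypothetical short sequence, for illustration only.
    example = "ATGGAGAAAATAGTGCTTCTT"
    for k in (1, 2, 3, 4):
        freqs = kmer_frequencies(example, k)
        print(k, len(freqs), round(sum(freqs.values()), 6))
```

    Each genome would yield vectors of 4, 16, 64, and 256 frequencies for k = 1 to 4, which a clustering method such as a self-organizing map could then take as input.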

  10. Caracterização física, química e sensorial de cookies confeccionados com farinha de talo de couve (FTC) e farinha de talo de espinafre (FTE) ricas em fibra alimentar [Physical, chemical, and sensorial characterization of cookies made with kale stalk flour (KSF) and spinach stalk flour (SSF) rich in nourishing fiber]

    Directory of Open Access Journals (Sweden)

    Ana Karina Mauro

    2010-09-01

    Full Text Available The objective of the study was to use Stalk Flours (SF) to prepare cookies. The SF were made from dehydrated kale stalks (KSF) and spinach stalks (SSF). Three kinds of cookies were prepared by modification of the standard sugar-snap cookie formula, method 10-50D (AMERICAN..., 1995): a control cookie, a cookie with 15% KSF, and a cookie with 15% SSF. The centesimal composition of the SF was determined according to the AOAC methods (ASSOCIATION..., 1995), and dietary fiber was analyzed by the method of van Soest (1963). The following parameters were analyzed in the cookies: weight, thickness, diameter, volume, and apparent density according to method 10-50D of the AACC (AMERICAN..., 1995); titratable acidity and pH using the Adolfo Lutz Institute technique (SÃO PAULO, 1985); and yield and thermal factor. The centesimal composition of the cookies was determined following AOAC standards (ASSOCIATION..., 1995), and acceptance of the experimental cookies was assessed with a 9-point hedonic scale. The SF were found to have low energy density and high dietary fiber and ash content. The experimental cookies showed higher (p...

  11. Improvement in the determination of HIV-1 tropism using the V3 gene sequence and a combination of bioinformatic tools.

    Science.gov (United States)

    Chueca, Natalia; Garrido, Carolina; Alvarez, Marta; Poveda, Eva; de Dios Luna, Juan; Zahonero, Natalia; Hernández-Quero, José; Soriano, Vicente; Maroto, Carmen; de Mendoza, Carmen; García, Federico

    2009-05-01

    Assessment of HIV tropism using bioinformatic tools based on V3 sequences correlates poorly with results provided by phenotypic tropism assays, particularly for recognizing X4 viruses. This may represent an obstacle for the use of CCR5 antagonists. An algorithm combining several bioinformatic tools might improve the correlation with phenotypic tropism results. A total of 200 V3 sequences from HIV-1 subtype B, available in several databases with known phenotypic tropism results, were used to evaluate the sensitivity and specificity of seven different bioinformatic tools (PSSM, SVM, C4.5 decision tree generator and C4.5, PART, Charge Rule, and geno2pheno). The best predictive bioinformatic tools were identified, and a model combining several of these was built. Using the 200 reference sequences, SVM and geno2pheno showed the highest sensitivity for detecting X4 viruses (98.8% and 93.7%, respectively); however, their specificity was relatively low (62.5% and 86.6%, respectively). For R5 viruses, PSSM and C4.5 gave the same results and outperformed other bioinformatic tools (95.7% sensitivity, 82% specificity). When results from three out of these four tools were concordant, the sensitivity and specificity, taking as reference the results from phenotypic tropism assays, were over 90% in predicting either R5 or X4 viruses (AUC: 0.9701; 95% CI: 0.9358-0.9889). An algorithm combining four distinct bioinformatic tools (SVM, geno2pheno, PSSM and C4.5) improves the genotypic prediction of HIV tropism and merits further evaluation, as it might prove useful as a screening strategy in clinical practice. Copyright 2009 Wiley-Liss, Inc.
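
    The concordance rule described in this abstract (accept a call when at least three of four tools agree) can be sketched as below, together with sensitivity and specificity against a phenotypic reference. This is an illustrative sketch, not the authors' algorithm: the function names, the dictionary layout, and the handling of unresolved (2 vs. 2) cases are assumptions, and the tool outputs are supplied by hand rather than computed by the real tools.

```python
def consensus_call(calls, min_agree=3):
    """Return 'X4' or 'R5' if at least min_agree tools agree, else None.

    calls maps a tool name to its predicted tropism ('X4' or 'R5').
    """
    for label in ("X4", "R5"):
        if sum(1 for c in calls.values() if c == label) >= min_agree:
            return label
    return None


def sensitivity_specificity(predicted, actual, positive="X4"):
    """Compute (sensitivity, specificity), skipping unresolved (None) calls.

    Returns None for a measure whose denominator is zero.
    """
    tp = fn = tn = fp = 0
    for p, a in zip(predicted, actual):
        if p is None:
            continue
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sens = tp / (tp + fn) if (tp + fn) else None
    spec = tn / (tn + fp) if (tn + fp) else None
    return sens, spec


if __name__ == "__main__":
    # Hypothetical tool outputs for one V3 sequence.
    calls = {"SVM": "X4", "geno2pheno": "X4", "PSSM": "X4", "C4.5": "R5"}
    print(consensus_call(calls))
```

    Sequences on which the tools split evenly are left unresolved here; how such cases are handled in practice is a design choice the abstract does not specify.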

  12. Elucidation of cross-species proteomic effects in human and hominin bone proteome identification through a bioinformatics experiment

    DEFF Research Database (Denmark)

    Welker, F.

    2018-01-01

    not been demonstrated. If error-tolerant searches do not overcome the cross-species proteomic issue then there might be inherent biases in the identified proteomes. Here, a bioinformatics experiment is performed to test this using a set of modern human bone proteomes and three independent searches against...... positions in the search against the chimpanzee proteome (≈90%, 6-8 Ma). This provides a bioinformatic background to future phylogenetic and proteomic analysis of ancient hominin proteomes, including the future description of novel hominin amino acid sequences, but also has negative implications...

  13. Suppression subtractive hybridization (SSH) combined with bioinformatics method: an integrated functional annotation approach for analysis of differentially expressed immune-genes in insects.

    Science.gov (United States)

    Badapanda, Chandan

    2013-01-01

    The suppression subtractive hybridization (SSH) approach, a PCR-based approach which amplifies differentially expressed cDNAs (complementary DNAs) while simultaneously suppressing amplification of common cDNAs, was employed to identify immune-inducible genes in insects. This technique has been used as a suitable tool for experimental identification of novel genes in eukaryotes as well as prokaryotes, both in species whose genomes have been sequenced and in species whose genomes have yet to be sequenced. In this article, I have proposed a method for in silico functional characterization of immune-inducible genes from insects. Apart from immune-inducible genes from insects, this method can be applied to the analysis of genes from other species, from bacteria to plants and animals. This article provides a background on the SSH-based method, taking specific examples from innate immune-inducible genes in insects, and subsequently proposes a bioinformatics pipeline for functional characterization of newly sequenced genes. The workflow proposed here can also be applied to any newly sequenced species generated from Next Generation Sequencing (NGS) platforms.

  14. Assessing computational genomics skills: Our experience in the H3ABioNet African bioinformatics network.

    Science.gov (United States)

    Jongeneel, C Victor; Achinike-Oduaran, Ovokeraye; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Akanle, Bola; Aron, Shaun; Ashano, Efejiro; Bendou, Hocine; Botha, Gerrit; Chimusa, Emile; Choudhury, Ananyo; Donthu, Ravikiran; Drnevich, Jenny; Falola, Oluwadamila; Fields, Christopher J; Hazelhurst, Scott; Hendry, Liesl; Isewon, Itunuoluwa; Khetani, Radhika S; Kumuthini, Judit; Kimuda, Magambo Phillip; Magosi, Lerato; Mainzer, Liudmila Sergeevna; Maslamoney, Suresh; Mbiyavanga, Mamana; Meintjes, Ayton; Mugutso, Danny; Mpangase, Phelelani; Munthali, Richard; Nembaware, Victoria; Ndhlovu, Andrew; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Panji, Sumir; Pillay, Venesa; Rendon, Gloria; Sengupta, Dhriti; Mulder, Nicola

    2017-06-01

    The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa) program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.

  15. A novel bioinformatic strategy to characterise microbial communities in biogas reactors

    DEFF Research Database (Denmark)

    Treu, Laura; Campanaro, Stefano; De Francisci, Davide

    2014-01-01

    , J.J. et al., 2011). For this reason we developed a bioinformatics strategy in order to create a tool to review the generated dataset and to obtain stricter control on the bacterial composition at the species level, with estimation of its reliability. The program performs local similarity search...... comparative analyses with MG-RAST (Meyer F. et al., 2008) results have been performed to test our strategy. We also found that our method can be used to understand which hypervariable region of 16S rDNA is more efficient in the identification at the species level in different genera. Our conclusion...... 16S hypervariable regions, especially when working with the low-quality, very short reads characteristic of next-generation sequencers (Mande S.S. et al., 2012). Previous works analysed the microbial community composition in biogas reactors via 16S rDNA sequencing (Luo, G. et al., 2013; Werner...

  16. The World-Wide Web: An Interface between Research and Teaching in Bioinformatics

    Directory of Open Access Journals (Sweden)

    James F. Aiton

    1994-01-01

    Full Text Available The rapid expansion occurring in World-Wide Web activity is beginning to make the concepts of ‘global hypermedia’ and ‘universal document readership’ realistic objectives of the new revolution in information technology. One consequence of this increase in usage is that educators and students are becoming more aware of the diversity of the knowledge base which can be accessed via the Internet. Although computerised databases and information services have long played a key role in bioinformatics, these same resources can also be used to provide core materials for teaching and learning. The large datasets and archives that have been compiled for biomedical research can be enhanced with the addition of a variety of multimedia elements (images, digital videos, animation, etc.). The use of this digitally stored information in structured and self-directed learning environments is likely to increase as activity across the World-Wide Web increases.

  17. A bioinformatics study concerning structural and functional properties of human caveolin proteins

    Directory of Open Access Journals (Sweden)

    Isvoran Adriana

    2014-01-01

    Full Text Available We performed a bioinformatics study to predict and compare the structural and functional properties of human caveolins: caveolin-1, -2 and -3. The computed local physicochemical properties, predictions of secondary structure elements, and interacting partners of caveolin-2 and -3 are compared to experimentally proven structural and functional properties of caveolin-1. These data, combined with the sequence alignment of the three caveolins, allowed us to predict and characterize the functional domains of caveolin-2 and -3. The hydrophobic regions of these proteins are highly similar in sequence and physicochemical properties, which is in good agreement with their known membrane locations and functions. The regions most divergent in sequence and properties are the C-terminal regions of the caveolins, suggesting that they might be responsible for their distinct predicted interactions, with direct consequences on signalling processes.

  18. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a major concern because the amount of data generated at several locations is growing exponentially. Therefore, there is a need to develop techniques that store data using compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large amounts of DNA sequences of varying length. This paper provides a performance analysis of OPTSDNA in a distributed bioinformatics computing system for the analysis of DNA sequences. The OPTSDNA algorithm is used to store DNA sequences of various sizes, from very small to very large, in a database. Both storage size and response time are calculated in this work. The efficiency and performance of the algorithm are high (in size calculation with percentage) when compared with other known sequential approaches.
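
    The abstract does not describe how OPTSDNA encodes sequences, so the following is not that algorithm. As a generic illustration of why compression helps with DNA storage, this minimal sketch packs each base into 2 bits (four bases per byte); the names `pack` and `unpack` are hypothetical, and the sketch assumes sequences contain only A, C, G, and T.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def pack(sequence):
    """Pack a DNA string into bytes, 2 bits per base.

    Returns (packed_bytes, length); the length is needed to unpack because
    the last byte may be only partly used. Raises KeyError on any character
    outside A, C, G, T.
    """
    data = bytearray((len(sequence) + 3) // 4)
    for i, base in enumerate(sequence.upper()):
        data[i // 4] |= _CODE[base] << (2 * (i % 4))
    return bytes(data), len(sequence)


def unpack(data, length):
    """Recover the DNA string from bytes produced by pack()."""
    return "".join(
        _BASE[(data[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )


if __name__ == "__main__":
    seq = "ACGTACGTA"
    packed, n = pack(seq)
    print(len(seq), "bases stored in", len(packed), "bytes")
    print(unpack(packed, n) == seq)
```

    Compared with one byte per character, this fixed-width scheme uses about a quarter of the space; real systems typically also need to handle ambiguity codes such as N, which this sketch deliberately leaves out.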

  19. Lecture 10: The European Bioinformatics Institute - "Big data" for biomedical sciences

    CERN Multimedia

    CERN. Geneva; Dana, Jose

    2013-01-01

    Part 1: Big data for biomedical sciences (Tom Hancocks) Ten years ago witnessed the completion of the first international 'Big Biology' project that sequenced the human genome. In the years since, biological sciences have seen a vast growth in data. In the coming years advances will come from integration of experimental approaches and the translation into applied technologies in the hospital, clinic and even at home. This talk will examine the development of infrastructure, physical and virtual, that will allow millions of life scientists across Europe better access to biological data. Tom studied Human Genetics at the University of Leeds and McMaster University, before completing an MSc in Analytical Genomics at the University of Birmingham. He has worked for the UK National Health Service in diagnostic genetics and in training healthcare scientists and clinicians in bioinformatics. Tom joined the EBI in 2012 and is responsible for the scientific development and delivery of training for the BioMedBridges pr...

  20. Assessing computational genomics skills: Our experience in the H3ABioNet African bioinformatics network.

    Directory of Open Access Journals (Sweden)

    C Victor Jongeneel

    2017-06-01

    Full Text Available The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa) program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.