WorldWideScience

Sample records for bioinformatics supporting discovery

  1. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Directory of Open Access Journals (Sweden)

    Willighagen Egon L

    2009-09-01

    Full Text Available Abstract Background Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP and REpresentational State Transfer (REST services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP, consisting of an extension (IO Data to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1 services are discoverable without the need of an external registry, 2 asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3 input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  2. Discovery and Classification of Bioinformatics Web Services

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminish the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrapWeb services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  3. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of human genome project is near the end, the work has begun for discovering novel genes from genome sequences and annotating their biological functions. Here are reviewed current major bioinformatics tools and technologies available for large scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  4. Bioinformatics and biomarker discovery "Omic" data analysis for personalized medicine

    CERN Document Server

    Azuaje, Francisco

    2010-01-01

    This book is designed to introduce biologists, clinicians and computational researchers to fundamental data analysis principles, techniques and tools for supporting the discovery of biomarkers and the implementation of diagnostic/prognostic systems. The focus of the book is on how fundamental statistical and data mining approaches can support biomarker discovery and evaluation, emphasising applications based on different types of "omic" data. The book also discusses design factors, requirements and techniques for disease screening, diagnostic and prognostic applications. Readers are provided w

  5. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  6. Supporting knowledge discovery in medicine.

    Science.gov (United States)

    Girardi, Dominic; Arthofer, Klaus

    2014-01-01

    Our ontology-based benchmarking infrastructure for hospitals, we presented on the eHealth 2012, has meanwhile proven useful. Besides, we gathered manifold experience in supporting knowledge discovery in medicine. This also led to further functions and plans with our software. We could confirm and extent our experience by a literature review on the knowledge discovery process in medicine, visual analytics and data mining and drafted an according approach for extending our software. We validated our approach by exemplarily implementing a parallel-coordinate data visualization into our software and plan to integrate further algorithms for visual analytics and machine learning to support knowledge discovery in medicine in diverse ways. This is very promising but can also fail due to technical or organizational details.

  7. Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

    Energy Technology Data Exchange (ETDEWEB)

    Ngu, A; Rocco, D; Critchlow, T; Buttler, D

    2003-12-22

    The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources--e.g. BLAST sequence homology search interfaces. However, the process for seeking the desired scientific information is still very tedious and frustrating. While there are several known servers on genomic data (e.g., GeneBank, EMBL, NCBI), that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results are hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change out-pace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source specific wrappers is needed to assist scientists to access to hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics which is inherent in the design of the Web data source. In this paper, we propose an automatic approach to classify Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

  8. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.;

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...

  9. Discovery-driven research and bioinformatics in nuclear receptor and coregulator signaling.

    Science.gov (United States)

    McKenna, Neil J

    2011-08-01

    Nuclear receptors (NRs) are a superfamily of ligand-regulated transcription factors that interact with coregulators and other transcription factors to direct tissue-specific programs of gene expression. Recent years have witnessed a rapid acceleration of the output of high-content data platforms in this field, generating discovery-driven datasets that have collectively described: the organization of the NR superfamily (phylogenomics); the expression patterns of NRs, coregulators and their target genes (transcriptomics); ligand- and tissue-specific functional NR and coregulator sites in DNA (cistromics); the organization of nuclear receptors and coregulators into higher order complexes (proteomics); and their downstream effects on homeostasis and metabolism (metabolomics). Significant bioinformatics challenges lie ahead both in the integration of this information into meaningful models of NR and coregulator biology, as well as in the archiving and communication of datasets to the global nuclear receptor signaling community. While holding great promise for the field, the ascendancy of discovery-driven research in this field brings with it a collective responsibility for researchers, publishers and funding agencies alike to ensure the effective archiving and management of these data. This review will discuss factors lying behind the increasing impact of discovery-driven research, examples of high-content datasets and their bioinformatic analysis, as well as a summary of currently curated web resources in this field. This article is part of a Special Issue entitled: Translating nuclear receptors from health to disease. PMID:21029773

  10. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    Science.gov (United States)

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. PMID:25195583

  11. On Supporting Physical Skill Discovery

    Science.gov (United States)

    Furukawa, Koichi; Suwa, Masaki; Kato, Takaaki

    One of the main difficulties in motor skill acquisition is attributed to body control based on wrong mental models. This is true to various domains such as playing sports and playing musical instruments. In order to acquire adequate motor skill by modifying false belief, we need to help people find appropriate key points in achieving a body control and integrate them. In this paper, we investigate three approaches to realize such support. The first one is to encourage exploration of the relations among key points constituting a motor skill, using a technique of meta-cognitive verbalization. The second one is to represent a motor skill by appropriate mechanical models. The third one is to integrate rules for component tasks in achieving a compound task. These three approaches, we argue, help people build an integrated mental model consisting of multiple relations among various key points, one that seems to be indispensable for acquisition of motor skills. These ideas suggest the possibility to create new skill rules to perform difficult tasks automatically.

  12. Applications and Methods Utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for Bioinformatics Resource Discovery and Disparate Data and Service Integration

    Science.gov (United States)

    Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...

  13. First Multitarget Chemo-Bioinformatic Model To Enable the Discovery of Antibacterial Peptides against Multiple Gram-Positive Pathogens.

    Science.gov (United States)

    Speck-Planche, Alejandro; Kleandrova, Valeria V; Ruso, Juan M; Cordeiro, M N D S

    2016-03-28

    Antimicrobial peptides (AMPs) have emerged as promising therapeutic alternatives to fight against the diverse infections caused by different pathogenic microorganisms. In this context, theoretical approaches in bioinformatics have paved the way toward the creation of several in silico models capable of predicting antimicrobial activities of peptides. All current models have several significant handicaps, which prevent the efficient search for highly active AMPs. Here, we introduce the first multitarget (mt) chemo-bioinformatic model devoted to performing alignment-free prediction of antibacterial activity of peptides against multiple Gram-positive bacterial strains. The model was constructed from a data set containing 2488 cases of AMPs sequences assayed against at least 1 out of 50 Gram-positive bacterial strains. This mt-chemo-bioinformatic model displayed percentages of correct classification higher than 90.00% in both training and prediction (test) sets. For the first time, two computational approaches derived from basic concepts in genetics and molecular biology were applied, allowing the calculations of the relative contributions of any amino acid (in a defined position) to the antibacterial activity of an AMP and depending on the bacterial strain used in the biological assay. The present mt-chemo-bioinformatic model constitutes a powerful tool to enable the discovery of potent and versatile AMPs.

  14. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  15. Discovery-driven research and bioinformatics in nuclear receptor and coregulator signaling

    OpenAIRE

    McKenna, Neil J

    2010-01-01

    Nuclear receptors (NRs) are a superfamily of ligand-regulated transcription factors that interact with coregulators and other transcription factors to direct tissue-specific programs of gene expression. Recent years have witnessed a rapid acceleration of the output of high content data platforms in this field, generating discovery-driven datasets that have collectively described: the organization of the NR superfamily (phylogenomics); the expression patterns of NRs, coregulators and their tar...

  16. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  17. FY02 CBNP Annual Report Input: Bioinformatics Support for CBNP Research and Deployments

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Wolinsky, M

    2002-10-31

    The events of FY01 dynamically reprogrammed the objectives of the CBNP bioinformatics support team, to meet rapidly-changing Homeland Defense needs and requests from other agencies for assistance: Use computational techniques to determine potential unique DNA signature candidates for microbial and viral pathogens of interest to CBNP researcher and to our collaborating partner agencies such as the Centers for Disease Control and Prevention (CDC), U.S. Department of Agriculture (USDA), Department of Defense (DOD), and Food and Drug Administration (FDA). Develop effective electronic screening measures for DNA signatures to reduce the cost and time of wet-bench screening. Build a comprehensive system for tracking the development and testing of DNA signatures. Build a chain-of-custody sample tracking system for field deployment of the DNA signatures as part of the BASIS project. Provide computational tools for use by CBNP Biological Foundations researchers.

  18. Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis1[C][W][OA

    Science.gov (United States)

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

    2012-01-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  19. Integration of bioinformatics and synthetic promoters leads to the discovery of novel elicitor-responsive cis-regulatory sequences in Arabidopsis.

    Science.gov (United States)

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J; Hehl, Reinhard

    2012-09-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  20. Discovery of novel xylosides in co-culture of basidiomycetes Trametes versicolor and Ganoderma applanatum by integrated metabolomics and bioinformatics.

    Science.gov (United States)

    Yao, Lu; Zhu, Li-Ping; Xu, Xiao-Yan; Tan, Ling-Ling; Sadilek, Martin; Fan, Huan; Hu, Bo; Shen, Xiao-Ting; Yang, Jie; Qiao, Bin; Yang, Song

    2016-01-01

    Transcriptomic analysis of cultured fungi suggests that many genes for secondary metabolite synthesis are presumably silent under standard laboratory condition. In order to investigate the expression of silent genes in symbiotic systems, 136 fungi-fungi symbiotic systems were built up by co-culturing seventeen basidiomycetes, among which the co-culture of Trametes versicolor and Ganoderma applanatum demonstrated the strongest coloration of confrontation zones. Metabolomics study of this co-culture discovered that sixty-two features were either newly synthesized or highly produced in the co-culture compared with individual cultures. Molecular network analysis highlighted a subnetwork including two novel xylosides (compounds 2 and 3). Compound 2 was further identified as N-(4-methoxyphenyl)formamide 2-O-β-D-xyloside and was revealed to have the potential to enhance the cell viability of human immortalized bronchial epithelial cell line of Beas-2B. Moreover, bioinformatics and transcriptional analysis of T. versicolor revealed a potential candidate gene (GI: 636605689) encoding xylosyltransferases for xylosylation. Additionally, 3-phenyllactic acid and orsellinic acid were detected for the first time in G. applanatum, which may be ascribed to response against T.versicolor stress. In general, the described co-culture platform provides a powerful tool to discover novel metabolites and help gain insights into the mechanism of silent gene activation in fungal defense. PMID:27616058

  1. Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP for bioinformatics resource discovery and disparate data and service integration

    Directory of Open Access Journals (Sweden)

    Nelson Rex T

    2010-06-01

    Full Text Available Abstract Background Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap" offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS] used SSWAP to semantically describe selected data and web services. Methods We selected high-priority Quantitative Trait Locus (QTL, genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL, Resource Description Framework (RDF and eXtensible Markup Language (XML documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. Results A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services, two of which allow the retrieval of DNA and RNA FASTA sequences with the third service providing nucleic acid sequence comparison capability (BLAST. Conclusions The need for semantic integration technologies has preceded

  2. Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration

    Science.gov (United States)

    2010-01-01

    Background Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap") offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services. Methods We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. Results A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services, two of which allow the retrieval of DNA and RNA FASTA sequences with the third service providing nucleic acid sequence comparison capability (BLAST). Conclusions The need for semantic integration technologies has preceded available solutions. We

  3. A New System To Support Knowledge Discovery: Telemakus.

    Science.gov (United States)

    Revere, Debra; Fuller, Sherrilynne S.; Bugni, Paul F.; Martin, George M.

    2003-01-01

    The Telemakus System builds on the areas of concept representation, schema theory, and information visualization to enhance knowledge discovery from scientific literature. This article describes the underlying theories and an overview of a working implementation designed to enhance the knowledge discovery process through retrieval, visual and…

  4. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    Science.gov (United States)

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  5. Visualising "Junk" DNA through Bioinformatics

    Science.gov (United States)

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology,and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  6. ADS Labs - Supporting Information Discovery in Science Education

    CERN Document Server

    Henneken, Edwin A

    2012-01-01

    The SAO/NASA Astrophysics Data System (ADS) is an open access digital library portal for researchers in astronomy and physics, operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant, successfully serving the professional science community for two decades. Currently there are about 55,000 frequent users (100+ queries per year), and up to 10 million infrequent users per year. Access by the general public now accounts for about half of all ADS use, demonstrating the vast reach of the content in our databases. The visibility and use of content in the ADS can be measured by the fact that there are over 17,000 links from Wikipedia pages to ADS content, a figure comparable to the number of links that Wikipedia has to OCLCs WorldCat catalog. The ADS, through its holdings and innovative techniques available in ADS Labs (http://adslabs.org), offers an environment for information discovery that is unlike any other service currently available to the astrophysics community. Literature discovery and...

  7. ADS Labs: Supporting Information Discovery in Science Education

    Science.gov (United States)

    Henneken, E. A.

    2013-04-01

    The SAO/NASA Astrophysics Data System (ADS) is an open access digital library portal for researchers in astronomy and physics, operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant, successfully serving the professional science community for two decades. Currently there are about 55,000 frequent users (100+ queries per year), and up to 10 million infrequent users per year. Access by the general public now accounts for about half of all ADS use, demonstrating the vast reach of the content in our databases. The visibility and use of content in the ADS can be measured by the fact that there are over 17,000 links from Wikipedia pages to ADS content, a figure comparable to the number of links that Wikipedia has to OCLC's WorldCat catalog. The ADS, through its holdings and innovative techniques available in ADS Labs, offers an environment for information discovery that is unlike any other service currently available to the astrophysics community. Literature discovery and review are important components of science education, aiding the process of preparing for a class, project, or presentation. The ADS has been recognized as a rich source of information for the science education community in astronomy, thanks to its collaborations within the astronomy community, publishers and projects like ComPADRE. One element that makes the ADS uniquely relevant for the science education community is the availability of powerful tools to explore aspects of the astronomy literature as well as the relationship between topics, people, observations and scientific papers. The other element is the extensive repository of scanned literature, a significant fraction of which consists of historical literature.

  8. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  9. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Full Text Available Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating, to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. Open standards, data

  10. ADS Services in support of the Discovery, Management and Evaluation of Science Data

    Science.gov (United States)

    Accomazzi, Alberto

    2015-12-01

    The NASA Astrophysics Data System (ADS) has long been used as a discovery platform for the scientific literature in Astronomy and Physics. With the addition of records describing datasets linked to publications, observing proposals and software used in refereed astronomy papers, the ADS is now increasingly used to find, access and cite an wider number of scientific resources. In this talk, I will discuss the recent efforts involving the indexing of software metadata, and our ongoing discussions with publishers in support of software and data citation. I will demonstrate the use of ADS's new services in support of discovery and evaluation of individual researchers as well as archival data products.

  11. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    Science.gov (United States)

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-01

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries. PMID:26510693

  12. Discovery of Candidate SNP by Bioinformatic Methods%基于生物信息学的SNP候选位点搜寻方法

    Institute of Scientific and Technical Information of China (English)

    陈炜; 张戈; 张思仲

    2001-01-01

    单核苷酸多态性 (Single Nucleotide Polymorphism, SNP)是人类基因组中最常见的遗传多态,在遗传学研究的很多方面具有重要的作用。它的搜寻正受到广泛关注。近年来,国际上出现了一种基于生物信息学的发掘SNP新方法。本文对该方法的两种策略及其各自所存在的问题作一介绍。%Single Nucleotide Polymorphism (SNP), the most common form of human genetic variation, represents a valuable resources for a variety of genetic research. There is considerable interest in the discovery of it. Recently, a new method based on bioinfomatics has been developed for the discovery of SNP. In this paper, the two strategy of this method and their respective problem are discussed.

  13. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    Science.gov (United States)

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields.

  14. An Introduction to Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    SHENG Qi-zheng; De Moor Bart

    2004-01-01

    As a newborn interdisciplinary field, bioinformatics is receiving increasing attention from biologists, computer scientists, statisticians, mathematicians and engineers. This paper briefly introduces the birth, importance, and extensive applications of bioinformatics in the different fields of biological research. A major challenge in bioinformatics - the unraveling of gene regulation - is discussed in detail.

  15. Microscopy and Bioinformatic Analyses of Lipid Metabolism Implicate a Sporophytic Signaling Network Supporting Pollen Development in Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Yixing Wang; Hong Wu; Ming Yang

    2008-01-01

    The Arabidopsis sporophytic tapetum undergoes a programmed degeneration process to secrete lipid and other materials to support pollen development.However,the molecular mechanism regulating the degeneration process is unknown.To gain insight into this molecular mechanism,we first determined that the most critical period for tapetal secretion to support pollen development iS from the vacuolate microspore stage to the early binucleate pollen stage.We then analyzed the expression of enzymes responsible for lipid biosynthesis and degradation with available in-silico data.The genes for these enzymes that are expressed in the stamen but not in the concurrent uninucleate microspore and binucleate pollen are of particular interest,as they presumably hold the clues to unique molecular processes in the sporophytic tissues compared to the gametophytic tissue.No gene for lipid biosynthesis but a single gene encoding a patatin-like protein likely for lipid mobilization was identified based on the selection criterion.A search for genes co-expressed with this gene identified additional genes encoding typical signal transduction components such as a leucine-rich repeat receptor kinase,an extra-large G-protein,other protein kinases,and transcription factors.In addition,proteases,cell wall degradation enzymes,and other proteins were also identified.These proteins thus may be components of a signaling network leading to degradation of a broad range of cellular components.Since a broad range of degradation activities is expected to occur only in the tapetal degeneration process at this stage in the stamen,it iS further hypothesized that the signaling network acts in the tapetal degeneration process.

  16. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  17. Deep Learning in Bioinformatics

    OpenAIRE

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-01-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current res...

  18. Introduction to Bioinformatics

    OpenAIRE

    Thampi, Sabu M.

    2009-01-01

    Bioinformatics is a new discipline that addresses the need to manage and interpret the data that in the past decade was massively generated by genomic research. This discipline represents the convergence of genomics, biotechnology and information technology, and encompasses analysis and interpretation of data, modeling of biological phenomena, and development of algorithms and statistics. This article presents an introduction to bioinformatics

  19. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein;

    2014-01-01

    cancer immunotherapies has yet to be fulfilled. The insufficient efficacy of existing treatments can be attributed to a number of biological and technical issues. In this review, we detail the current limitations of immunotherapy target selection and design, and review computational methods to streamline...

  20. Disaster Response Tools for Decision Support and Data Discovery - E-DECIDER and GeoGateway

    Science.gov (United States)

    Glasscoe, M. T.; Donnellan, A.; Parker, J. W.; Granat, R. A.; Lyzenga, G. A.; Pierce, M. E.; Wang, J.; Grant Ludwig, L.; Eguchi, R. T.; Huyck, C. K.; Hu, Z.; Chen, Z.; Yoder, M. R.; Rundle, J. B.; Rosinski, A.

    2015-12-01

    Providing actionable data for situational awareness following an earthquake or other disaster is critical to decision makers in order to improve their ability to anticipate requirements and provide appropriate resources for response. E-DECIDER (Emergency Data Enhanced Cyber-Infrastructure for Disaster Evaluation and Response) is a decision support system producing remote sensing and geophysical modeling products that are relevant to the emergency preparedness and response communities and serves as a gateway to enable the delivery of actionable information to these communities. GeoGateway is a data product search and analysis gateway for scientific discovery, field use, and disaster response focused on NASA UAVSAR and GPS data that integrates with fault data, seismicity and models. Key information on the nature, magnitude and scope of damage, or Essential Elements of Information (EEI), necessary to achieve situational awareness are often generated from a wide array of organizations and disciplines, using any number of geospatial and non-geospatial technologies. We have worked in partnership with the California Earthquake Clearinghouse to develop actionable data products for use in their response efforts, particularly in regularly scheduled, statewide exercises like the recent May 2015 Capstone/SoCal NLE/Ardent Sentry Exercises and in the August 2014 South Napa earthquake activation. We also provided a number of products, services, and consultation to the NASA agency-wide response to the April 2015 Gorkha, Nepal earthquake. We will present perspectives on developing tools for decision support and data discovery in partnership with the Clearinghouse and for the Nepal earthquake. Products delivered included map layers as part of the common operational data plan for the Clearinghouse, delivered through XchangeCore Web Service Data Orchestration, enabling users to create merged datasets from multiple providers. For the Nepal response effort, products included models

  1. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  2. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  3. Delivering bioinformatics training: bridging the gaps between computer science and biomedicine.

    Science.gov (United States)

    Dubay, Christopher; Brundege, James M; Hersh, William; Spackman, Kent

    2002-01-01

    Biomedical researchers have always sought innovative methodologies to elucidate the underlying biology in their experimental models. As the pace of research has increased with new technologies that 'scale-up' these experiments, researchers have developed acute needs for the information technologies which assist them in managing and processing their experiments and results into useful data analyses that support scientific discovery. The application of information technology to support this discovery process is often called bioinformatics. We have observed a 'gap' in the training of those individuals who traditionally aid in the delivery of information technology at the level of the end-user (e.g. a systems analyst working with a biomedical researcher) which can negatively impact the successful application of technological solutions to biomedical research problems. In this paper we describe the roots and branches of bioinformatics to illustrate a range of applications and technologies that it encompasses. We then propose a taxonomy of bioinformatics as a framework for the identification of skills employed in the field. The taxonomy can be used to assess a set of skills required by a student to traverse this hierarchy from one area to another. We then describe a curriculum that attempts to deliver the identified skills to a broad audience of participants, and describe our experiences with the curriculum to show how it can help bridge the 'gap'.

  4. Chemistry in Bioinformatics

    OpenAIRE

    Mitchell John; Murray-Rust Peter; Rzepa Henry

    2005-01-01

    Abstract Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties) free during the auth...

  5. Bioinformatics and School Biology

    Science.gov (United States)

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  6. Supporting Communication in a Collaborative Discovery Learning Environment: the Effect of Instruction

    NARCIS (Netherlands)

    Saab, Nadira; Joolingen, van Wouter R.; Hout-Wolters, van Bernadette H.A.M.

    2007-01-01

    We present a study on the effect of instruction on collaboration in a collaborative discovery learning environment. The instruction we used, called RIDE, is built upon four principles identified in the literature on collaborative processes: Respect, Intelligent collaboration, Deciding together, and

  7. GSO: Designing a Well-Founded Service Ontology to Support Dynamic Service Discovery and Composition

    NARCIS (Netherlands)

    Bonino da Silva Santos, Luiz Olavo; Guizzardi, Giancarlo; Guizzardi-Silva Souza, Renata; Goncalves da Silva, Eduardo; Ferreira Pires, Luis; Sinderen, van Marten

    2009-01-01

    A pragmatic and straightforward approach to semantic service discovery is to match inputs and outputs of user requests with the input and output requirements of registered service descriptions. This approach can be extended by using pre-conditions, effects and semantic annotations (meta-data) in an

  8. Tools and data services registry: a community effort to document bioinformatics resources.

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C E; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand.Here we present a community-driven curation effort, supported by ELIXIR-the European infrastructure for biological information-that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners.As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  9. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  10. In the Spotlight: Bioinformatics

    Science.gov (United States)

    Wang, May Dongmei

    2016-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparing to more established technologies such as microarrays. NGS technology has been predicated to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  11. Feature selection in bioinformatics

    Science.gov (United States)

    Wang, Lipo

    2012-06-01

    In bioinformatics, there are often a large number of input features. For example, there are millions of single nucleotide polymorphisms (SNPs) that are genetic variations which determine the dierence between any two unrelated individuals. In microarrays, thousands of genes can be proled in each test. It is important to nd out which input features (e.g., SNPs or genes) are useful in classication of a certain group of people or diagnosis of a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sucient for tasks at hand and we demonstrate this with some real-world data.

  12. In the Spotlight: Bioinformatics

    OpenAIRE

    Wang, May Dongmei

    2012-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on t...

  13. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  14. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  15. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education. PMID:21036947

  16. The application of the open pharmacological concepts triple store (open PHACTS to support drug discovery research.

    Directory of Open Access Journals (Sweden)

    Joseline Ratnam

    Full Text Available Integration of open access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly-available pharmacological, physicochemical and biomolecular data, represents it in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover 1 the comprehensive identification of chemical matter for a dopamine receptor drug discovery program 2 the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB signaling pathway that have a relevance to disease and 3 the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery.

  17. Scalable pattern recognition algorithms applications in computational biology and bioinformatics

    CERN Document Server

    Maji, Pradipta

    2014-01-01

    Reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics Includes numerous examples and experimental results to support the theoretical concepts described Concludes each chapter with directions for future research and a comprehensive bibliography

  18. Bioinformatics tools for analysing viral genomic data.

    Science.gov (United States)

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  19. Bioinformatic methods in protein characterization

    OpenAIRE

    Kallberg, Yvonne

    2002-01-01

    Bioinformatics is an emerging interdisciplinary research field in which mathematics. computer science and biology meet. In this thesis. bioinformatic methods for analysis of functional and structural properties among proteins will be presented. I have developed and applied bioinformatic methods on the enzyme superfamily of short-chain dehydrogenases/reductases (SDRs), coenzyme-binding enzymes of the Rossmann fold type, and amyloid-forming proteins and peptides. The basis...

  20. AGENTS AND OWL-S BASED SEMANTIC WEB SERVICE DISCOVERY WITH USER PREFERENCE SUPPORT

    Directory of Open Access Journals (Sweden)

    Rohallah Benaboud

    2013-05-01

    Full Text Available Service-oriented computing (SOC is an interdisciplinary paradigm that revolutionizes the very fabric ofdistributed software development applications that adopt service-oriented architectures (SOA can evolveduring their lifespan and adapt to changing or unpredictable environments more easily. SOA is builtaround the concept of Web Services. Although the Web services constitute a revolution in Word Wide Web,they are always regarded as non-autonomous entities and can be exploited only after their discovery. Withthe help of software agents, Web services are becoming more efficient and more dynamic.The topic of this paper is the development of an agent based approach for Web services discovery andselection in witch, OWL-S is used to describe Web services, QoS and service customer request. We developan efficient semantic service matching which takes into account concepts properties to match concepts inWeb service and service customer request descriptions. Our approach is based on an architecturecomposed of four layers: Web service and Request description layer, Functional match layer, QoScomputing layer and Reputation computing layer.

  1. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES 27, 28 February 1, 2, 3 March 2006 from 11:00 to 12:00 - Auditorium, bldg. 500 Decoding the Genome A special series of 5 lectures on: Recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  2. Pattern recognition in bioinformatics.

    Science.gov (United States)

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.

  3. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    Science.gov (United States)

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. PMID:26269093

  4. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  5. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  6. Bioinformatics meets parasitology.

    Science.gov (United States)

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  7. Engineering BioInformatics

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    @@ With the completion of human genome sequencing, a new era of bioinformatics st arts. On one hand, due to the advance of high throughput DNA microarray technol ogies, functional genomics such as gene expression information has increased exp onentially and will continue to do so for the foreseeable future. Conventional m eans of storing, analysing and comparing related data are already overburdened. Moreover, the rich information in genes , their functions and their associated wide biological implication requires new technologies of analysing data that employ sophisticated statistical and machine learning algorithms, powerful com puters and intensive interaction together different data sources such as seque nce data, gene expression data, proteomics data and metabolic pathway informati on to discover complex genomic structures and functional patterns with other bi ological process to gain a comprehensive understanding of cell physiology.

  8. Virtual bioinformatics distance learning suite*.

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-05-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, "Introduction to Bioinformatics" and "Bioinformatics in Functional Genomics." Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.

  9. Semantically supporting data discovery, markup and aggregation in the European Marine Observation and Data Network (EMODnet)

    Science.gov (United States)

    Lowry, Roy; Leadbetter, Adam

    2014-05-01

    concepts it is acceptable to aggregate for a given application. Another approach, which has been developed as a use case for concept and data discovery and will be implemented as part of the EC/United States/Australian collaboration the Ocean Data Interoperability Platform, is to expose the well defined, but little publicised, semantic model which underpins each and every concept within the PUV. This will be done in a machine readable form, so that tools can be built to aggregate data and concepts by, for example, the measured parameter; the environmental sphere or compartment of the sampling; and the methodology of the analysis of the parameter. There is interesting work being developed by CSIRO which may be used in this approach. The importance of these data aggregations is growing as more data providers use terms from semantic resources to describe their data, and allows for aggregating data from numerous sources. This importance will grow as data become "born semantic", i.e. when semantics are embedded with data from the point of collection. In this presentation we introduce a brief history of the development of the PUV; the use cases for data aggregation and discovery outlined above; and the semantic model from which the PUV is built; and the ideas for embedding semantics in data from the point of collection.

  10. Combining knowledge discovery from databases (KDD) and case-based reasoning (CBR) to support diagnosis of medical images

    Science.gov (United States)

    Stranieri, Andrew; Yearwood, John; Pham, Binh

    1999-07-01

    The development of data warehouses for the storage and analysis of very large corpora of medical image data represents a significant trend in health care and research. Amongst other benefits, the trend toward warehousing enables the use of techniques for automatically discovering knowledge from large and distributed databases. In this paper, we present an application design for knowledge discovery from databases (KDD) techniques that enhance the performance of the problem solving strategy known as case- based reasoning (CBR) for the diagnosis of radiological images. The problem of diagnosing the abnormality of the cervical spine is used to illustrate the method. The design of a case-based medical image diagnostic support system has three essential characteristics. The first is a case representation that comprises textual descriptions of the image, visual features that are known to be useful for indexing images, and additional visual features to be discovered by data mining many existing images. The second characteristic of the approach presented here involves the development of a case base that comprises an optimal number and distribution of cases. The third characteristic involves the automatic discovery, using KDD techniques, of adaptation knowledge to enhance the performance of the case based reasoner. Together, the three characteristics of our approach can overcome real time efficiency obstacles that otherwise mitigate against the use of CBR to the domain of medical image analysis.

  11. Interactive, Online, Adsorption Lab to Support Discovery of the Scientific Process

    Science.gov (United States)

    Carroll, K. C.; Ulery, A. L.; Chamberlin, B.; Dettmer, A.

    2014-12-01

    Science students require more than methods practice in lab activities; they must gain an understanding of the application of the scientific process through lab work. Large classes, time constraints, and funding may limit student access to science labs, denying students access to the types of experiential learning needed to motivate and develop new scientists. Interactive, discovery-based computer simulations and virtual labs provide an alternative, low-risk opportunity for learners to engage in lab processes and activities. Students can conduct experiments, collect data, draw conclusions, and even abort a session. We have developed an online virtual lab, through which students can interactively develop as scientists as they learn about scientific concepts, lab equipment, and proper lab techniques. Our first lab topic is adsorption of chemicals to soil, but the methodology is transferrable to other topics. In addition to learning the specific procedures involved in each lab, the online activities will prompt exploration and practice in key scientific and mathematical concepts, such as unit conversion, significant digits, assessing risks, evaluating bias, and assessing quantity and quality of data. These labs are not designed to replace traditional lab instruction, but to supplement instruction on challenging or particularly time-consuming concepts. To complement classroom instruction, students can engage in a lab experience outside the lab and over a shorter time period than often required with real-world adsorption studies. More importantly, students can reflect, discuss, review, and even fail at their lab experience as part of the process to see why natural processes and scientific approaches work the way they do. Our Media Productions team has completed a series of online digital labs available at virtuallabs.nmsu.edu and scienceofsoil.com, and these virtual labs are being integrated into coursework to evaluate changes in student learning.

  12. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed.

  13. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  14. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  15. Application Of Data Mining In Bioinformatics

    OpenAIRE

    KHALID RAZA

    2012-01-01

    This article highlights some of the basic concepts of bioinformatics and data mining. The major research areas of bioinformatics are highlighted. The application of data mining in the domain of bioinformatics is explained. It also highlights some of the current challenges and opportunities of data mining in bioinformatics.

  16. Clustering Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Muhammad Ali Masood

    2015-01-01

    Full Text Available Dealing with data means to group information into a set of categories either in order to learn new artifacts or understand new domains. For this purpose researchers have always looked for the hidden patterns in data that can be defined and compared with other known notions based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, having the tools of data classification and data clustering, is one of the most powerful techniques to deal with data in such a manner that it can help researchers identify the required information. As a step forward to address this challenge, experts have utilized clustering techniques as a mean of exploring hidden structure and patterns in underlying data. Improved stability, robustness and accuracy of unsupervised data classification in many fields including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself as a reliable tool. To identify the clusters in datasets algorithm are utilized to partition data set into several groups based on the similarity within a group. There is no specific clustering algorithm, but various algorithms are utilized based on domain of data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized based upon different approaches. This paper is a survey of few clustering techniques out of many in data mining. For the purpose five of the most common clustering techniques out of many have been discussed. The clustering techniques which have been surveyed are: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN and Self-Organizing Map (SOM clustering.

  17. Discovery of fairy circles in Australia supports self-organization theory.

    Science.gov (United States)

    Getzin, Stephan; Yizhaq, Hezi; Bell, Bronwyn; Erickson, Todd E; Postle, Anthony C; Katra, Itzhak; Tzuk, Omer; Zelnik, Yuval R; Wiegand, Kerstin; Wiegand, Thorsten; Meron, Ehud

    2016-03-29

    Vegetation gap patterns in arid grasslands, such as the "fairy circles" of Namibia, are one of nature's greatest mysteries and subject to a lively debate on their origin. They are characterized by small-scale hexagonal ordering of circular bare-soil gaps that persists uniformly in the landscape scale to form a homogeneous distribution. Pattern-formation theory predicts that such highly ordered gap patterns should be found also in other water-limited systems across the globe, even if the mechanisms of their formation are different. Here we report that so far unknown fairy circles with the same spatial structure exist 10,000 km away from Namibia in the remote outback of Australia. Combining fieldwork, remote sensing, spatial pattern analysis, and process-based mathematical modeling, we demonstrate that these patterns emerge by self-organization, with no correlation with termite activity; the driving mechanism is a positive biomass-water feedback associated with water runoff and biomass-dependent infiltration rates. The remarkable match between the patterns of Australian and Namibian fairy circles and model results indicate that both patterns emerge from a nonuniform stationary instability, supporting a central universality principle of pattern-formation theory. Applied to the context of dryland vegetation, this principle predicts that different systems that go through the same instability type will show similar vegetation patterns even if the feedback mechanisms and resulting soil-water distributions are different, as we indeed found by comparing the Australian and the Namibian fairy-circle ecosystems. These results suggest that biomass-water feedbacks and resultant vegetation gap patterns are likely more common in remote drylands than is currently known. PMID:26976567

  18. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    OpenAIRE

    Knox Barry E; Qin Maochun; McIlvain Vera A; Danko Charles G; Pertsov Arkady M

    2007-01-01

    Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding site...

  19. Generations of interdisciplinarity in bioinformatics

    Science.gov (United States)

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  20. Training Experimental Biologists in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Pedro Fernandes

    2012-01-01

    Full Text Available Bioinformatics, for its very nature, is devoted to a set of targets that constantly evolve. Training is probably the best response to the constant need for the acquisition of bioinformatics skills. It is interesting to assess the effects of training in the different sets of researchers that make use of it. While training bench experimentalists in the life sciences, we have observed instances of changes in their attitudes in research that, if well exploited, can have beneficial impacts in the dialogue with professional bioinformaticians and influence the conduction of the research itself.

  1. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http......://www.secondarymetabolites.org) is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field....

  2. Bioinformatics interoperability: all together now !

    NARCIS (Netherlands)

    Meganck, B.; Mergen, P.; Meirte, D.

    2009-01-01

    The following text presents some personal ideas about the way (bio)informatics2 is heading, along with some examples of how our institution – the Royal Museum for Central Africa (RMCA) – is gearing up for these new times ahead. It tries to find the important trends amongst the buzzwords, and to demo

  3. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  4. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  5. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biolog- ical databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not det

  6. Application of bioinformatics in tropical medicine

    Institute of Scientific and Technical Information of China (English)

    Wiwanitkit V

    2008-01-01

    Bioinformatics is a usage of information technology to help solve biological problems by designing novel and in-cisive algorithms and methods of analyses.Bioinformatics becomes a discipline vital in the era of post-genom-ics.In this review article,the application of bioinformatics in tropical medicine will be presented and dis-cussed.

  7. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe;

    2016-01-01

    and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomery. To assess their novelty, the two...

  8. An introduction to proteome bioinformatics.

    Science.gov (United States)

    Jones, Andrew R; Hubbard, Simon J

    2010-01-01

    This book is part of the Methods in Molecular Biology series, and provides a general overview of computational approaches used in proteome research. In this chapter, we give an overview of the scope of the book in terms of current proteomics experimental techniques and the reasons why computational approaches are needed. We then give a summary of each chapter, which together provide a picture of the state of the art in proteome bioinformatics research.

  9. Bioinformatics for Genome Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported essentially equal numbers of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately inferred tree, they were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the denderogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. If we supply all 15 possible rooted guide trees, they expected that at least one of them should be as good at CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which effect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became ever increasing bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold

  10. A Framework of Knowledge Integration and Discovery for Supporting Pharmacogenomics Target Predication of Adverse Drug Events: A Case Study of Drug-Induced Long QT Syndrome.

    Science.gov (United States)

    Jiang, Guoqian; Wang, Chen; Zhu, Qian; Chute, Christopher G

    2013-01-01

    Knowledge-driven text mining is becoming an important research area for identifying pharmacogenomics target genes. However, few of such studies have been focused on the pharmacogenomics targets of adverse drug events (ADEs). The objective of the present study is to build a framework of knowledge integration and discovery that aims to support pharmacogenomics target predication of ADEs. We integrate a semantically annotated literature corpus Semantic MEDLINE with a semantically coded ADE knowledgebase known as ADEpedia using a semantic web based framework. We developed a knowledge discovery approach combining a network analysis of a protein-protein interaction (PPI) network and a gene functional classification approach. We performed a case study of drug-induced long QT syndrome for demonstrating the usefulness of the framework in predicting potential pharmacogenomics targets of ADEs.

  11. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, Brend

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene ex

  12. Bioinformatics in Africa: The Rise of Ghana?

    Directory of Open Access Journals (Sweden)

    Thomas K Karikari

    2015-09-01

    Full Text Available Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  13. Design of experiments utilization to map the processing capabilities of a micro-spray dryer: particle design and throughput optimization in support of drug discovery.

    Science.gov (United States)

    Ormes, James D; Zhang, Dan; Chen, Alex M; Hou, Shirley; Krueger, Davida; Nelson, Todd; Templeton, Allen

    2013-02-01

    There has been a growing interest in amorphous solid dispersions for bioavailability enhancement in drug discovery. Spray drying, as shown in this study, is well suited to produce prototype amorphous dispersions in the Candidate Selection stage where drug supply is limited. This investigation mapped the processing window of a micro-spray dryer to achieve desired particle characteristics and optimize throughput/yield. Effects of processing variables on the properties of hypromellose acetate succinate were evaluated by a fractional factorial design of experiments. Parameters studied include solid loading, atomization, nozzle size, and spray rate. Response variables include particle size, morphology and yield. Unlike most other commercial small-scale spray dryers, the ProCepT was capable of producing particles with a relatively wide mean particle size, ca. 2-35 µm, allowing material properties to be tailored to support various applications. In addition, an optimized throughput of 35 g/hour with a yield of 75-95% was achieved, which affords to support studies from Lead-identification/Lead-optimization to early safety studies. A regression model was constructed to quantify the relationship between processing parameters and the response variables. The response surface curves provide a useful tool to design processing conditions, leading to a reduction in development time and drug usage to support drug discovery. PMID:22414114

  14. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  15. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  16. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Full Text Available Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI, and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  17. An Adaptive Gateway Discovery Algorithm to support QoS When Providing Internet Access to Mobile Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Mari Carmen Domingo

    2007-04-01

    Full Text Available When a node in an ad hoc network wants Internet access, it needs to obtain information about the available gateways and it should select the most appropriate of them. In this work we propose a new gateway discovery scheme suitable for real-time applications that adjusts the frequency of gateway advertisements dynamically. This adjustment is related to the percentage of real-time sources that have quality of service problems because of excessive end-to-end delays. The optimal values for the configuration parameters (time interval and threshold of the proposed adaptive gateway discovery mechanism for the selected network conditions have been studied with the aid of simulations. The scalability of the proposed scheme with respect to mobility as well as the impact of best-effort traffic load have been analyzed. Simulation results indicate that the proposed scheme significantly improves the average end-to-end delay, jitter and packet delivery ratio of real-time flows; the routing overhead is also reduced and there is no starvation of best-effort traffic.

  18. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude;

    2012-01-01

    to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes...... to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials....... Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database...

  19. Bioinformatics for the synthetic biology of natural products: integrating across the Design-Build-Test cycle.

    Science.gov (United States)

    Carbonell, Pablo; Currin, Andrew; Jervis, Adrian J; Rattray, Nicholas J W; Swainston, Neil; Yan, Cunyu; Takano, Eriko; Breitling, Rainer

    2016-08-27

    Covering: 2000 to 2016Progress in synthetic biology is enabled by powerful bioinformatics tools allowing the integration of the design, build and test stages of the biological engineering cycle. In this review we illustrate how this integration can be achieved, with a particular focus on natural products discovery and production. Bioinformatics tools for the DESIGN and BUILD stages include tools for the selection, synthesis, assembly and optimization of parts (enzymes and regulatory elements), devices (pathways) and systems (chassis). TEST tools include those for screening, identification and quantification of metabolites for rapid prototyping. The main advantages and limitations of these tools as well as their interoperability capabilities are highlighted. PMID:27185383

  20. Bioinformatics for the synthetic biology of natural products: integrating across the Design–Build–Test cycle

    Science.gov (United States)

    Currin, Andrew; Jervis, Adrian J.; Rattray, Nicholas J. W.; Swainston, Neil; Yan, Cunyu; Breitling, Rainer

    2016-01-01

    Covering: 2000 to 2016 Progress in synthetic biology is enabled by powerful bioinformatics tools allowing the integration of the design, build and test stages of the biological engineering cycle. In this review we illustrate how this integration can be achieved, with a particular focus on natural products discovery and production. Bioinformatics tools for the DESIGN and BUILD stages include tools for the selection, synthesis, assembly and optimization of parts (enzymes and regulatory elements), devices (pathways) and systems (chassis). TEST tools include those for screening, identification and quantification of metabolites for rapid prototyping. The main advantages and limitations of these tools as well as their interoperability capabilities are highlighted. PMID:27185383

  1. Computational approaches to natural product discovery

    NARCIS (Netherlands)

    Medema, M.H.; Fischbach, M.A.

    2015-01-01

    Starting with the earliest Streptomyces genome sequences, the promise of natural product genome mining has been captivating: genomics and bioinformatics would transform compound discovery from an ad hoc pursuit to a high-throughput endeavor. Until recently, however, genome mining has advanced natura

  2. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    Science.gov (United States)

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  3. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  4. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  5. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  6. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  7. Rapid Development of Bioinformatics Education in China

    Science.gov (United States)

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  8. Developing expertise in bioinformatics for biomedical research in Africa

    OpenAIRE

    Karikari, Thomas K.; Emmanuel Quansah; Wael M.Y. Mohamed

    2015-01-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are help...

  9. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  10. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  11. Decades of Discovery

    Science.gov (United States)

    2011-06-01

    For the past two-and-a-half decades, the Office of Science at the U.S. Department of Energy has been at the forefront of scientific discovery. Over 100 important discoveries supported by the Office of Science are represented in this document.

  12. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Full Text Available Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  13. NETTAB 2014: From high-throughput structural bioinformatics to integrative systems biology.

    Science.gov (United States)

    Romano, Paolo; Cordero, Francesca

    2016-03-02

    The fourteenth NETTAB workshop, NETTAB 2014, was devoted to a range of disciplines going from structural bioinformatics, to proteomics and to integrative systems biology. The topics of the workshop were centred around bioinformatics methods, tools, applications, and perspectives for models, standards and management of high-throughput biological data, structural bioinformatics, functional proteomics, mass spectrometry, drug discovery, and systems biology.43 scientific contributions were presented at NETTAB 2014, including keynote, special guest and tutorial talks, oral communications, and posters. Full papers from some of the best contributions presented at the workshop were later submitted to a special Call for this Supplement.Here, we provide an overview of the workshop and introduce manuscripts that have been accepted for publication in this Supplement.

  14. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS.

    Science.gov (United States)

    Hu, Rongdong; Liu, Guangming; Jiang, Jingfei; Wang, Lixin

    2015-01-01

    Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  15. Semantics in support of biodiversity knowledge discovery: an introduction to the biological collections ontology and related ontologies.

    Science.gov (United States)

    Walls, Ramona L; Deck, John; Guralnick, Robert; Baskauf, Steve; Beaman, Reed; Blum, Stanley; Bowers, Shawn; Buttigieg, Pier Luigi; Davies, Neil; Endresen, Dag; Gandolfo, Maria Alejandra; Hanner, Robert; Janning, Alyssa; Krishtalka, Leonard; Matsunaga, Andréa; Midford, Peter; Morrison, Norman; Ó Tuama, Éamonn; Schildhauer, Mark; Smith, Barry; Stucky, Brian J; Thomer, Andrea; Wieczorek, John; Whitacre, Jamie; Wooley, John

    2014-01-01

    The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.

  16. Semantics in support of biodiversity knowledge discovery: an introduction to the biological collections ontology and related ontologies.

    Directory of Open Access Journals (Sweden)

    Ramona L Walls

    Full Text Available The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques, as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1 individual organisms, including voucher specimens from ecological studies and museum specimens, 2 bulk or environmental samples (e.g., gut contents, soil, water that include DNA, other molecules, and potentially many organisms, especially microbes, and 3 survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and

  17. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Shan Li

    2014-01-01

    Full Text Available With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  18. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  19. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    expression analysis to derive regulatory pathways, the development of statistical techniques, clustering techniques and data mining techniques to derive protein-protein and protein-DNA interactions, and modeling of 3D structure of proteins and 3D docking between proteins and biochemicals for rational drug design, difference analysis between pathogenic and non-pathogenic strains to identify candidate genes for vaccines and anti-microbial agents, and the whole genome comparison to understand the microbial evolution. The development of bioinformatics techniques has enhanced the pace of biological discovery by automated analysis of large number of microbial genomes. We are on the verge of using all this knowledge to understand cellular mechanisms at the systemic level. The developed bioinformatics techniques have potential to facilitate (i the discovery of causes of diseases, (ii vaccine and rational drug design, and (iii improved cost effective agents for bioremediation by pruning out the dead ends. Despite the fast paced global effort, the current analysis is limited by the lack of available gene-functionality from the wet-lab data, the lack of computer algorithms to explore vast amount of data with unknown functionality, limited availability of protein-protein and protein-DNA interactions, and the lack of knowledge of temporal and transient behavior of genes and pathways.

  20. Elucidating the role of topological pattern discovery and support vector machine in generating predictive models for Indian summer monsoon rainfall

    Science.gov (United States)

    Chattopadhyay, Manojit; Chattopadhyay, Surajit

    2016-10-01

    The present paper reports a study, where growing hierarchical self-organising map (GHSOM) has been applied to achieve a visual cluster analysis to the Indian rainfall dataset consisting of 142 years of Indian rainfall data so that the yearly rainfall can be segregated into small groups to visualise the pattern of clustering behaviour of yearly rainfall due to changes in monthly rainfall for each year. Also, through support vector machine (SVM), it has been observed that generation of clusters impacts positively on the prediction of the Indian summer monsoon rainfall. Results have been presented through statistical and graphical analyses.

  1. Integrative Bioinformatics for Genomics and Proteomics

    OpenAIRE

    Wu, C.H.

    2011-01-01

    Systems integration is becoming the driving force for 21st century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes and transcriptomes to proteomes and interactomes. To fully realize the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. We are developing a bioinformatics research ...

  2. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  3. Bioinformatics for saffron (Crocus sativus L.) improvement

    OpenAIRE

    Ghulam A. PARRAY; Abdul G. Rather; Parvez Sofi; Shafiq A. Wani; Amjad M. Husaini; Asif B. Shikari; Javid I. Mir

    2009-01-01

    Saffron (Crocus sativus L.) is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots). Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The ch...

  4. Bioinformatics Identification of Antigenic Peptide: Predicting the Specificity of Major MHC Class I and II Pathway Players

    DEFF Research Database (Denmark)

    Lund, Ole; Karosiene, Edita; Lundegaard, Claus;

    2013-01-01

    Bioinformatics methods for immunology have become increasingly used over the last decade and now form an integrated part of most epitope discovery projects. This wide usage has led to the confusion of defining which of the many methods to use for what problems. In this chapter, an overview is given...

  5. MAPI: towards the integrated exploitation of bioinformatics Web Services

    Directory of Open Access Journals (Sweden)

    Karlsson Johan

    2011-10-01

    Full Text Available Abstract Background Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. Results To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. Conclusions The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others.

  6. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  7. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced have given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid and downstream to nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed that the time of most recent common ancestor of human/civet SARS related coronavirus to be 1999-2002, with estimated substitution rate of 4´10-4 to 2´10-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV, between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1. Codon usage bias in coronaviruses were observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.

  8. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  9. Experiences with workflows for automating data-intensive bioinformatics.

    Science.gov (United States)

    Spjuth, Ola; Bongcam-Rudloff, Erik; Hernández, Guillermo Carrasco; Forer, Lukas; Giovacchini, Mario; Guimera, Roman Valls; Kallio, Aleksi; Korpelainen, Eija; Kańduła, Maciej M; Krachunov, Milko; Kreil, David P; Kulev, Ognyan; Łabaj, Paweł P; Lampa, Samuel; Pireddu, Luca; Schönherr, Sebastian; Siretskiy, Alexey; Vassilev, Dimitar

    2015-01-01

    High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution. PMID:26282399

  10. SNPTrack™ : an integrated bioinformatics system for genetic association studies.

    Science.gov (United States)

    Xu, Joshua; Kelly, Reagan; Zhou, Guangxu; Turner, Steven A; Ding, Don; Harris, Stephen C; Hong, Huixiao; Fang, Hong; Tong, Weida

    2012-01-01

    A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm. PMID:23245293

  11. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  12. SPOT--towards temporal data mining in medicine and bioinformatics.

    Science.gov (United States)

    Tusch, Guenter; Bretl, Chris; O'Connor, Martin; Connor, Martin; Das, Amar

    2008-01-01

    Mining large clinical and bioinformatics databases often includes exploration of temporal data. E.g., in liver transplantation, researchers might look for patients with an unusual time pattern of potential complications of the liver. In Knowledge-based Temporal Abstraction time-stamped data points are transformed into an interval-based representation. We extended this framework by creating an open-source platform, SPOT. It supports the R statistical package and knowledge representation standards (OWL, SWRL) using the open source Semantic Web tool Protégé-OWL. PMID:18999225

  13. WU-Blast2 server at the European Bioinformatics Institute

    OpenAIRE

    Lopez, Rodrigo; Silventoinen, Ville; Robinson, Stephen; Kibria, Asif; Gish, Warren

    2003-01-01

    Since 1995, the WU-BLAST programs (http://blast.wustl.edu) have provided a fast, flexible and reliable method for similarity searching of biological sequence databases. The software is in use at many locales and web sites. The European Bioinformatics Institute's WU-Blast2 (http://www.ebi.ac.uk/blast2/) server has been providing free access to these search services since 1997 and today supports many features that both enhance the usability and expand on the scope of the software.

  14. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  15. Composable languages for bioinformatics: the NYoSh experiment

    Directory of Open Access Journals (Sweden)

    Manuele Simi

    2014-01-01

    Full Text Available Language WorkBenches (LWBs are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell. NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment

  16. Composable languages for bioinformatics: the NYoSh experiment.

    Science.gov (United States)

    Simi, Manuele; Campagne, Fabien

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  17. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  18. An environment for knowledge discovery in biology.

    Science.gov (United States)

    Barrera, Junior; Cesar, Roberto M; Ferreira, João E; Gubitoso, Marco D

    2004-07-01

    This paper describes a data mining environment for knowledge discovery in bioinformatics applications. The system has a generic kernel that implements the mining functions to be applied to input primary databases, with a warehouse architecture, of biomedical information. Both supervised and unsupervised classification can be implemented within the kernel and applied to data extracted from the primary database, with the results being suitably stored in a complex object database for knowledge discovery. The kernel also includes a specific high-performance library that allows designing and applying the mining functions in parallel machines. The experimental results obtained by the application of the kernel functions are reported.

  19. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformatic

  20. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotides polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapter 2, 3 and 4), and the subsequent imple

  1. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  2. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    NARCIS (Netherlands)

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of pr

  3. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable. PMID:26473029

  4. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  5. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  6. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  7. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  8. Agile parallel bioinformatics workflow management using Pwrake

    Directory of Open Access Journals (Sweden)

    Tanaka Masahiro

    2011-09-01

    Full Text Available Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows

  9. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  10. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    There is a large literature that investigates how homogenous securities traded on different markets incorporate new information (price discovery analysis). We extend this concept to the stochastic volatility process and investigate how markets contribute to the efficient stochastic volatility whi...

  11. Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers.

    Science.gov (United States)

    Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.

  12. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    Science.gov (United States)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  13. Cloud Infrastructures for In Silico Drug Discovery: Economic and Practical Aspects

    Directory of Open Access Journals (Sweden)

    Daniele D'Agostino

    2013-01-01

    Full Text Available Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements.

  14. Cloud infrastructures for in silico drug discovery: economic and practical aspects.

    Science.gov (United States)

    D'Agostino, Daniele; Clematis, Andrea; Quarati, Alfonso; Cesini, Daniele; Chiappori, Federica; Milanesi, Luciano; Merelli, Ivan

    2013-01-01

    Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements. PMID:24106693

  15. Cloud infrastructures for in silico drug discovery: economic and practical aspects.

    Science.gov (United States)

    D'Agostino, Daniele; Clematis, Andrea; Quarati, Alfonso; Cesini, Daniele; Chiappori, Federica; Milanesi, Luciano; Merelli, Ivan

    2013-01-01

    Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most suitable offers, optionally adding the provisioning of dedicated services with higher levels of quality. This paper analyses some economic and practical aspects of exploiting cloud computing in a real research scenario for the in silico drug discovery in terms of requirements, costs, and computational load based on the number of expected users. In particular, our work is aimed at supporting both the researchers and the cloud broker delivering an IaaS cloud infrastructure for biotechnology laboratories exposing different levels of nonfunctional requirements.

  16. Motif Discovery in Tissue-Specific Regulatory Sequences Using Directed Information

    Directory of Open Access Journals (Sweden)

    James Douglas Engel

    2007-12-01

    Full Text Available Motif discovery for the identification of functional regulatory elements underlying gene expression is a challenging problem. Sequence inspection often leads to discovery of novel motifs (including transcription factor sites with previously uncharacterized function in gene expression. Coupled with the complexity underlying tissue-specific gene expression, there are several motifs that are putatively responsible for expression in a certain cell type. This has important implications in understanding fundamental biological processes such as development and disease progression. In this work, we present an approach to the identification of motifs (not necessarily transcription factor sites and examine its application to some questions in current bioinformatics research. These motifs are seen to discriminate tissue-specific gene promoter or regulatory regions from those that are not tissue-specific. There are two main contributions of this work. Firstly, we propose the use of directed information for such classification constrained motif discovery, and then use the selected features with a support vector machine (SVM classifier to find the tissue specificity of any sequence of interest. Such analysis yields several novel interesting motifs that merit further experimental characterization. Furthermore, this approach leads to a principled framework for the prospective examination of any chosen motif to be discriminatory motif for a group of coexpressed/coregulated genes, thereby integrating sequence and expression perspectives. We hypothesize that the discovery of these motifs would enable the large-scale investigation for the tissue-specific regulatory role of any conserved sequence element identified from genome-wide studies.

  17. Motif Discovery in Tissue-Specific Regulatory Sequences Using Directed Information

    Directory of Open Access Journals (Sweden)

    States David

    2007-01-01

    Full Text Available Motif discovery for the identification of functional regulatory elements underlying gene expression is a challenging problem. Sequence inspection often leads to discovery of novel motifs (including transcription factor sites with previously uncharacterized function in gene expression. Coupled with the complexity underlying tissue-specific gene expression, there are several motifs that are putatively responsible for expression in a certain cell type. This has important implications in understanding fundamental biological processes such as development and disease progression. In this work, we present an approach to the identification of motifs (not necessarily transcription factor sites and examine its application to some questions in current bioinformatics research. These motifs are seen to discriminate tissue-specific gene promoter or regulatory regions from those that are not tissue-specific. There are two main contributions of this work. Firstly, we propose the use of directed information for such classification constrained motif discovery, and then use the selected features with a support vector machine (SVM classifier to find the tissue specificity of any sequence of interest. Such analysis yields several novel interesting motifs that merit further experimental characterization. Furthermore, this approach leads to a principled framework for the prospective examination of any chosen motif to be discriminatory motif for a group of coexpressed/coregulated genes, thereby integrating sequence and expression perspectives. We hypothesize that the discovery of these motifs would enable the large-scale investigation for the tissue-specific regulatory role of any conserved sequence element identified from genome-wide studies.

  18. Bioinformatics analyses for signal transduction networks

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Research in signaling networks contributes to a deeper understanding of organism living activities. With the development of experimental methods in the signal transduction field, more and more mechanisms of signaling pathways have been discovered. This paper introduces such popular bioin-formatics analysis methods for signaling networks as the common mechanism of signaling pathways and database resource on the Internet, summerizes the methods of analyzing the structural properties of networks, including structural Motif finding and automated pathways generation, and discusses the modeling and simulation of signaling networks in detail, as well as the research situation and tendency in this area. Now the investigation of signal transduction is developing from small-scale experiments to large-scale network analysis, and dynamic simulation of networks is closer to the real system. With the investigation going deeper than ever, the bioinformatics analysis of signal transduction would have immense space for development and application.

  19. Bioinformatics Approaches for Human Gut Microbiome Research

    Directory of Open Access Journals (Sweden)

    Zhijun Zheng

    2016-07-01

    Full Text Available The human microbiome has received much attention because many studies have reported that the human gut microbiome is associated with several diseases. The very large datasets that are produced by these kinds of studies means that bioinformatics approaches are crucial for their analysis. Here, we systematically reviewed bioinformatics tools that are commonly used in microbiome research, including a typical pipeline and software for sequence alignment, abundance profiling, enterotype determination, taxonomic diversity, identifying differentially abundant species/genes, gene cataloging, and functional analyses. We also summarized the algorithms and methods used to define metagenomic species and co-abundance gene groups to expand our understanding of unclassified and poorly understood gut microbes that are undocumented in the current genome databases. Additionally, we examined the methods used to identify metagenomic biomarkers based on the gut microbiome, which might help to expand the knowledge and approaches for disease detection and monitoring.

  20. Bioinformatics for saffron (Crocus sativus L. improvement

    Directory of Open Access Journals (Sweden)

    Ghulam A. Parray

    2009-02-01

    Full Text Available Saffron (Crocus sativus L. is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots. Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The characterization of the transcriptome of saffron stigmas is the most vital for throwing light on the molecular basis of flavor, color biogenesis, genomic organization and biology of gynoecium of saffron. The information derived can be utilized for constructing biological pathways involved in the biosynthesis of principal components of saffron i.e., crocin, crocetin, safranal, picrocrocin and safchiA

  1. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    OpenAIRE

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of protein or DNA sequences. An accurate alignment can provide valuable information for experimentation on the newly found sequences. It is indispensable in basic research as well as in practical applic...

  2. Genome bioinformatics of tomato and potato

    OpenAIRE

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  3. Chapter 16: Text Mining for Translational Bioinformatics

    OpenAIRE

    Bretonnel Cohen, K; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. P...

  4. Adapting bioinformatics curricula for big data

    OpenAIRE

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S; Jason H Moore

    2015-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these...

  5. [Applied problems of mathematical biology and bioinformatics].

    Science.gov (United States)

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigations which emerged in the course of work on the project "Human genome". The main applied problems of these sciences are grug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  6. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  7. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  8. Bioinformatics Approach in Plant Genomic Research.

    Science.gov (United States)

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    The advance in genomics technology leads to the dramatic change in plant biology research. Plant biologists now easily access to enormous genomic data to deeply study plant high-density genetic variation at molecular level. Therefore, fully understanding and well manipulating bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and continued expanding recently. Meanwhile, analytical methods based on bioinformatics are also well developed in many aspects of plant genomic research including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association study. However, constantly upgrading in computational infrastructures, such as high capacity data storage and high performing analysis software, is the real challenge for plant genome research. This review paper focuses on challenges and opportunities which knowledge and skills in bioinformatics can bring to plant scientists in present plant genomics era as well as future aspects in critical need for effective tools to facilitate the translation of knowledge from new sequencing data to enhancement of plant productivity. PMID:27499685

  9. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  10. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  11. Chapter 16: text mining for translational bioinformatics.

    Directory of Open Access Journals (Sweden)

    K Bretonnel Cohen

    2013-04-01

    Full Text Available Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  12. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2015-01-01

    as their central concepts and conceptualization of the entrepreneurial function. On this basis we discuss three central themes that cut across the four alternatives: process, uncertainty, and agency. These themes provide new foci for entrepreneurship research and can help to generate new research questions......In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship; the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery presented in these four alternatives, as well...

  13. In-silico prediction of drug targets, biological activities, signal pathways and regulating networks of dioscin based on bioinformatics

    OpenAIRE

    Yin, Lianhong; Zheng, Lingli; Xu, Lina; Dong, Deshi; Han, Xu; Qi, Yan; Zhao, Yanyan; Xu, Youwei; Peng, Jinyong

    2015-01-01

    Background Inverse docking technology has been a trend of drug discovery, and bioinformatics approaches have been used to predict target proteins, biological activities, signal pathways and molecular regulating networks affected by drugs for further pharmacodynamic and mechanism studies. Methods In the present paper, inverse docking technology was applied to screen potential targets from potential drug target database (PDTD). Then, the corresponding gene information of the obtained drug-targe...

  14. High-throughput next-generation sequencing technologies foster new cutting-edge computing techniques in bioinformatics

    OpenAIRE

    Yang, Mary Qu; Athey, Brian D; Arabnia, Hamid R.; Sung, Andrew H; Liu, Qingzhong; Yang, Jack Y; Mao, Jinghe; Deng, Youping

    2009-01-01

    The advent of high-throughput next generation sequencing technologies have fostered enormous potential applications of supercomputing techniques in genome sequencing, epi-genetics, metagenomics, personalized medicine, discovery of non-coding RNAs and protein-binding sites. To this end, the 2008 International Conference on Bioinformatics and Computational Biology (Biocomp) – 2008 World Congress on Computer Science, Computer Engineering and Applied Computing (Worldcomp) was designed to promote ...

  15. Emerging role of bioinformatics tools and software in evolution of clinical research.

    Science.gov (United States)

    Gill, Supreet Kaur; Christopher, Ajay Francis; Gupta, Vikas; Bansal, Parveen

    2016-01-01

    Clinical research is making toiling efforts for promotion and wellbeing of the health status of the people. There is a rapid increase in number and severity of diseases like cancer, hepatitis, HIV etc, resulting in high morbidity and mortality. Clinical research involves drug discovery and development whereas clinical trials are performed to establish safety and efficacy of drugs. Drug discovery is a long process starting with the target identification, validation and lead optimization. This is followed by the preclinical trials, intensive clinical trials and eventually post marketing vigilance for drug safety. Softwares and the bioinformatics tools play a great role not only in the drug discovery but also in drug development. It involves the use of informatics in the development of new knowledge pertaining to health and disease, data management during clinical trials and to use clinical data for secondary research. In addition, new technology likes molecular docking, molecular dynamics simulation, proteomics and quantitative structure activity relationship in clinical research results in faster and easier drug discovery process. During the preclinical trials, the software is used for randomization to remove bias and to plan study design. In clinical trials software like electronic data capture, Remote data capture and electronic case report form (eCRF) is used to store the data. eClinical, Oracle clinical are software used for clinical data management and for statistical analysis of the data. After the drug is marketed the safety of a drug could be monitored by drug safety software like Oracle Argus or ARISg. Therefore, softwares are used from the very early stages of drug designing, to drug development, clinical trials and during pharmacovigilance. This review describes different aspects related to application of computers and bioinformatics in drug designing, discovery and development, formulation designing and clinical research. PMID:27453827

  16. Emerging role of bioinformatics tools and software in evolution of clinical research

    Science.gov (United States)

    Gill, Supreet Kaur; Christopher, Ajay Francis; Gupta, Vikas; Bansal, Parveen

    2016-01-01

    Clinical research is making toiling efforts for promotion and wellbeing of the health status of the people. There is a rapid increase in number and severity of diseases like cancer, hepatitis, HIV etc, resulting in high morbidity and mortality. Clinical research involves drug discovery and development whereas clinical trials are performed to establish safety and efficacy of drugs. Drug discovery is a long process starting with the target identification, validation and lead optimization. This is followed by the preclinical trials, intensive clinical trials and eventually post marketing vigilance for drug safety. Softwares and the bioinformatics tools play a great role not only in the drug discovery but also in drug development. It involves the use of informatics in the development of new knowledge pertaining to health and disease, data management during clinical trials and to use clinical data for secondary research. In addition, new technology likes molecular docking, molecular dynamics simulation, proteomics and quantitative structure activity relationship in clinical research results in faster and easier drug discovery process. During the preclinical trials, the software is used for randomization to remove bias and to plan study design. In clinical trials software like electronic data capture, Remote data capture and electronic case report form (eCRF) is used to store the data. eClinical, Oracle clinical are software used for clinical data management and for statistical analysis of the data. After the drug is marketed the safety of a drug could be monitored by drug safety software like Oracle Argus or ARISg. Therefore, softwares are used from the very early stages of drug designing, to drug development, clinical trials and during pharmacovigilance. This review describes different aspects related to application of computers and bioinformatics in drug designing, discovery and development, formulation designing and clinical research.

  17. Emerging role of bioinformatics tools and software in evolution of clinical research

    Directory of Open Access Journals (Sweden)

    Supreet Kaur Gill

    2016-01-01

    Full Text Available Clinical research is making toiling efforts for promotion and wellbeing of the health status of the people. There is a rapid increase in number and severity of diseases like cancer, hepatitis, HIV etc, resulting in high morbidity and mortality. Clinical research involves drug discovery and development whereas clinical trials are performed to establish safety and efficacy of drugs. Drug discovery is a long process starting with the target identification, validation and lead optimization. This is followed by the preclinical trials, intensive clinical trials and eventually post marketing vigilance for drug safety. Softwares and the bioinformatics tools play a great role not only in the drug discovery but also in drug development. It involves the use of informatics in the development of new knowledge pertaining to health and disease, data management during clinical trials and to use clinical data for secondary research. In addition, new technology likes molecular docking, molecular dynamics simulation, proteomics and quantitative structure activity relationship in clinical research results in faster and easier drug discovery process. During the preclinical trials, the software is used for randomization to remove bias and to plan study design. In clinical trials software like electronic data capture, Remote data capture and electronic case report form (eCRF is used to store the data. eClinical, Oracle clinical are software used for clinical data management and for statistical analysis of the data. After the drug is marketed the safety of a drug could be monitored by drug safety software like Oracle Argus or ARISg. Therefore, softwares are used from the very early stages of drug designing, to drug development, clinical trials and during pharmacovigilance. This review describes different aspects related to application of computers and bioinformatics in drug designing, discovery and development, formulation designing and clinical research.

  18. Proteomic and Bioinformatics Analyses of Mouse Liver Microsomes

    Directory of Open Access Journals (Sweden)

    Fang Peng

    2012-01-01

    Full Text Available Microsomes are derived mostly from endoplasmic reticulum and are an ideal target to investigate compound metabolism, membrane-bound enzyme functions, lipid-protein interactions, and drug-drug interactions. To better understand the molecular mechanisms of the liver and its diseases, mouse liver microsomes were isolated and enriched with differential centrifugation and sucrose gradient centrifugation, and microsome membrane proteins were further extracted from isolated microsomal fractions by the carbonate method. The enriched microsome proteins were arrayed with two-dimensional gel electrophoresis (2DE and carbonate-extracted microsome membrane proteins with one-dimensional gel electrophoresis (1DE. A total of 183 2DE-arrayed proteins and 99 1DE-separated proteins were identified with tandem mass spectrometry. A total of 259 nonredundant microsomal proteins were obtained and represent the proteomic profile of mouse liver microsomes, including 62 definite microsome membrane proteins. The comprehensive bioinformatics analyses revealed the functional categories of those microsome proteins and provided clues into biological functions of the liver. The systematic analyses of the proteomic profile of mouse liver microsomes not only reveal essential, valuable information about the biological function of the liver, but they also provide important reference data to analyze liver disease-related microsome proteins for biomarker discovery and mechanism clarification of liver disease.

  19. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations.

    Science.gov (United States)

    Hamernik, Debora L; Adelson, David L

    2003-01-01

    An electronic workshop was conducted on 4 November-13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver) format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second 'stakeholder' e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a) to define priorities for animal genome database development; and (b) to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a) data repository; (b) tools for genome analysis; (c) annotation; (d) practical application of genomic data; and (e) a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS), was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES) was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA. PMID:18629125

  20. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations

    Directory of Open Access Journals (Sweden)

    David L. Adelson

    2006-04-01

    Full Text Available An electronic workshop was conducted on 4 November–13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second ‘stakeholder’ e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a to define priorities for animal genome database development; and (b to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a data repository; (b tools for genome analysis; (c annotation; (d practical application of genomic data; and (e a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS, was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA.

  1. Tissue Banking, Bioinformatics, and Electronic Medical Records: The Front-End Requirements for Personalized Medicine

    Directory of Open Access Journals (Sweden)

    K. Stephen Suh

    2013-01-01

    Full Text Available Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs. The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publically available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMR in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine.

  2. Assessment of composite motif discovery methods

    Directory of Open Access Journals (Sweden)

    Johansen Jostein

    2008-02-01

    Full Text Available Abstract Background Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery – discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. Results We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Conclusion Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual

  3. Translational Bioinformatics:Past, Present, and Future

    Institute of Scientific and Technical Information of China (English)

    Jessica D. Tenenbaum

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contex-tualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field.

  4. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  5. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  6. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  7. Standardizing the next generation of bioinformatics software development with BioHDF (HDF5).

    Science.gov (United States)

    Mason, Christopher E; Zumbo, Paul; Sanders, Stephan; Folk, Mike; Robinson, Dana; Aydt, Ruth; Gollery, Martin; Welsh, Mark; Olson, N Eric; Smith, Todd M

    2010-01-01

    Next Generation Sequencing technologies are limited by the lack of standard bioinformatics infrastructures that can reduce data storage, increase data processing performance, and integrate diverse information. HDF technologies address these requirements and have a long history of use in data-intensive science communities. They include general data file formats, libraries, and tools for working with the data. Compared to emerging standards, such as the SAM/BAM formats, HDF5-based systems demonstrate significantly better scalability, can support multiple indexes, store multiple data types, and are self-describing. For these reasons, HDF5 and its BioHDF extension are well suited for implementing data models to support the next generation of bioinformatics applications. PMID:20865556

  8. Discovery Mondays

    CERN Document Server

    2003-01-01

    Many people don't realise quite how much is going on at CERN. Would you like to gain first-hand knowledge of CERN's scientific and technological activities and their many applications? Try out some experiments for yourself, or pick the brains of the people in charge? If so, then the «Lundis Découverte» or Discovery Mondays, will be right up your street. Starting on May 5th, on every first Monday of the month you will be introduced to a different facet of the Laboratory. CERN staff, non-scientists, and members of the general public, everyone is welcome. So tell your friends and neighbours and make sure you don't miss this opportunity to satisfy your curiosity and enjoy yourself at the same time. You won't have to listen to a lecture, as the idea is to have open exchange with the expert in question and for each subject to be illustrated with experiments and demonstrations. There's no need to book, as Microcosm, CERN's interactive museum, will be open non-stop from 7.30 p.m. to 9 p.m. On the first Discovery M...

  9. The 2015 Bioinformatics Open Source Conference (BOSC 2015.

    Directory of Open Access Journals (Sweden)

    Nomi L Harris

    2016-02-01

    Full Text Available The Bioinformatics Open Source Conference (BOSC is organized by the Open Bioinformatics Foundation (OBF, a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG before the annual Intelligent Systems in Molecular Biology (ISMB conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  10. The bioinformatics of next generation sequencing: a meeting report

    Institute of Scientific and Technical Information of China (English)

    Ravi Shankar

    2011-01-01

    @@ The Studio of Computational Biology & Bioinformatics (SCBB), IHBT, CSIR,Palampur, India organized one of the very first national workshop funded by DBT,Govt.of India, on the Bioinformatics issues associated with next generation sequencing approaches.The course structure was designed by SCBB, IHBT.The workshop took place in the IHBT premise on 17 and 18 June 2010.

  11. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  12. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  13. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  14. Virginia Bioinformatics Institute, partners receive $1.45 million award for plant protection in developing countries

    OpenAIRE

    Bland, Susan

    2010-01-01

    A research collaboration led by a professor from the Virginia Bioinformatics Institute at Virginia Tech will work to develop methods to protect agriculturally important crops in developing countries from devastating attacks from plant pathogens with the support of a $1.45 million award from the Basic Research to Enable Agricultural Development (BREAD) program sponsored by the National Science Foundation (NSF) and the Bill & Melinda Gates Foundation (BMGF).

  15. The European Bioinformatics Institute's data resources.

    Science.gov (United States)

    Brooksbank, Catherine; Camon, Evelyn; Harris, Midori A; Magrane, Michele; Martin, Maria Jesus; Mulder, Nicola; O'Donovan, Claire; Parkinson, Helen; Tuli, Mary Ann; Apweiler, Rolf; Birney, Ewan; Brazma, Alvis; Henrick, Kim; Lopez, Rodrigo; Stoesser, Guenter; Stoehr, Peter; Cameron, Graham

    2003-01-01

    As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics Institute (EBI) hosts six core databases, which store information on DNA sequences (EMBL-Bank), protein sequences (SWISS-PROT and TrEMBL), protein structure (MSD), whole genomes (Ensembl) and gene expression (ArrayExpress). But just as a cell would be useless if it couldn't transcribe DNA or translate RNA, our resources would be compromised if each existed in isolation. We have therefore developed a range of tools that not only facilitate the deposition and retrieval of biological information, but also allow users to carry out searches that reflect the interconnectedness of biological information. The EBI's databases and tools are all available on our website at www.ebi.ac.uk. PMID:12519944

  16. Bioinformatic and Biometric Methods in Plant Morphology

    Directory of Open Access Journals (Sweden)

    Surangi W. Punyasena

    2014-08-01

    Full Text Available Recent advances in microscopy, imaging, and data analyses have permitted both the greater application of quantitative methods and the collection of large data sets that can be used to investigate plant morphology. This special issue, the first for Applications in Plant Sciences, presents a collection of papers highlighting recent methods in the quantitative study of plant form. These emerging biometric and bioinformatic approaches to plant sciences are critical for better understanding how morphology relates to ecology, physiology, genotype, and evolutionary and phylogenetic history. From microscopic pollen grains and charcoal particles, to macroscopic leaves and whole root systems, the methods presented include automated classification and identification, geometric morphometrics, and skeleton networks, as well as tests of the limits of human assessment. All demonstrate a clear need for these computational and morphometric approaches in order to increase the consistency, objectivity, and throughput of plant morphological studies.

  17. Bioinformatics analysis of estrogen-responsive genes

    Science.gov (United States)

    Handel, Adam E.

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter will discuss some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  18. Wrapping and interoperating bioinformatics resources using CORBA.

    Science.gov (United States)

    Stevens, R; Miller, C

    2000-02-01

    Bioinformaticians seeking to provide services to working biologists are faced with the twin problems of distribution and diversity of resources. Bioinformatics databases are distributed around the world and exist in many kinds of storage forms, platforms and access paradigms. To provide adequate services to biologists, these distributed and diverse resources have to interoperate seamlessly within single applications. The Common Object Request Broker Architecture (CORBA) offers one technical solution to these problems. The key component of CORBA is its use of object orientation as an intermediate form to translate between different representations. This paper concentrates on an explanation of object orientation and how it can be used to overcome the problems of distribution and diversity by describing the interfaces between objects.

  19. Bioinformatics Analysis of Estrogen-Responsive Genes.

    Science.gov (United States)

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  20. Combining multiple decisions: applications to bioinformatics

    International Nuclear Information System (INIS)

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods

  1. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    Next generation sequencing (NGS) has revolutionized the field of genomics and its wide range of applications has resulted in the genome-wide analysis of hundreds of species and the development of thousands of computational tools. This thesis represents my work on NGS analysis of four species, Lotus...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and microrrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  2. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    Directory of Open Access Journals (Sweden)

    Kristina M. Obom

    2009-12-01

    Full Text Available The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses, however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation.

  3. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    OpenAIRE

    Obom, Kristina. M.; Cummings, Patrick J.

    2009-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses, however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no signifi...

  4. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    Science.gov (United States)

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are of bioinformatic nature. The bioinformatics described herein are by no means exhaustive, rather they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico.There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu . PMID:27460371

  5. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Agren Rasmus

    2011-01-01

    Full Text Available Abstract Background Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results MEMOSys (MEtabolic MOdel research and development System is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  6. Mutational and Bioinformatic Analysis of Haloarchaeal Lipobox-Containing Proteins

    Directory of Open Access Journals (Sweden)

    Stefanie Storf

    2010-01-01

    Full Text Available A conserved lipid-modified cysteine found in a protein motif commonly referred to as a lipobox mediates the membrane anchoring of a subset of proteins transported across the bacterial cytoplasmic membrane via the Sec pathway. Sequenced haloarchaeal genomes encode many putative lipoproteins and recent studies have confirmed the importance of the conserved lipobox cysteine for signal peptide processing of three lipobox-containing proteins in the model archaeon Haloferax volcanii. We have extended these in vivo analyses to additional Hfx. volcanii substrates, supporting our previous in silico predictions and confirming the diversity of predicted Hfx. volcanii lipoproteins. Moreover, using extensive comparative secretome analyses, we identified genes encodining putative lipoproteins across a wide range of archaeal species. While our in silico analyses, supported by in vivo data, indicate that most haloarchaeal lipoproteins are Tat substrates, these analyses also predict that many crenarchaeal species lack lipoproteins altogether and that other archaea, such as nonhalophilic euryarchaeal species, transport lipoproteins via the Sec pathway. To facilitate the identification of genes that encode potential haloarchaeal Tat-lipoproteins, we have developed TatLipo, a bioinformatic tool designed to detect lipoboxes in haloarchaeal Tat signal peptides. Our results provide a strong foundation for future studies aimed at identifying components of the archaeal lipoprotein biogenesis pathway.

  7. On reliable discovery of molecular signatures

    Directory of Open Access Journals (Sweden)

    Björkegren Johan

    2009-01-01

    Full Text Available Abstract Background Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy are relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.

  8. The Semantic Automated Discovery and Integration (SADI Web service Design-Pattern, API and Reference Implementation

    Directory of Open Access Journals (Sweden)

    Wilkinson Mark D

    2011-10-01

    Full Text Available Abstract Background The complexity and inter-related nature of biological data poses a difficult challenge for data and tool integration. There has been a proliferation of interoperability standards and projects over the past decade, none of which has been widely adopted by the bioinformatics community. Recent attempts have focused on the use of semantics to assist integration, and Semantic Web technologies are being welcomed by this community. Description SADI - Semantic Automated Discovery and Integration - is a lightweight set of fully standards-compliant Semantic Web service design patterns that simplify the publication of services of the type commonly found in bioinformatics and other scientific domains. Using Semantic Web technologies at every level of the Web services "stack", SADI services consume and produce instances of OWL Classes following a small number of very straightforward best-practices. In addition, we provide codebases that support these best-practices, and plug-in tools to popular developer and client software that dramatically simplify deployment of services by providers, and the discovery and utilization of those services by their consumers. Conclusions SADI Services are fully compliant with, and utilize only foundational Web standards; are simple to create and maintain for service providers; and can be discovered and utilized in a very intuitive way by biologist end-users. In addition, the SADI design patterns significantly improve the ability of software to automatically discover appropriate services based on user-needs, and automatically chain these into complex analytical workflows. We show that, when resources are exposed through SADI, data compliant with a given ontological model can be automatically gathered, or generated, from these distributed, non-coordinating resources - a behaviour we have not observed in any other Semantic system. Finally, we show that, using SADI, data dynamically generated from Web services

  9. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  10. 一种面向业务领域的web服务发现方法以及支撑框架%Method and supporting framework for business domain-oriented web service discovery

    Institute of Scientific and Technical Information of China (English)

    刘佳; 王海洋; 崔立真; 史玉良

    2008-01-01

    This paper proposes a new business domain-oriented web service discovery method and framework to solve the low precision results caused by UDDI (universal description, discovery and integration) syntactic discovery and the difficulty in selecting from among functionally equivalent web services. When requesting services, service clusters are extracted from concrete services in terms of functional requests; then, through business information properties consultation, the most suitable services are determined and finally bound to user requests. The whole process is transparent to users. This framework is also tested and supported through a prototype based on a travel domain, IPVita (intelligent platform of virtual travel agency).%针对UDDI关键字匹配带来的服务发现精度低,以及难以从功能相同的多个web服务中选择合适服务的问题,提出一种新的面向业务领域的web服务发现方法和相关框架.当用户请求服务时,根据功能请求从实际服务中抽取出服务簇,然后由业务信息属性等非功能性属性进行协商,确定出适合的服务并与用户请求绑定,而整个发现过程对用户来说都是透明的.通过建立一个基于旅游领域的原型系统IPVita来测试和支持此框架.

  11. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.;

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisci......The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry...... into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples....

  12. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics.

  13. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756

  14. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    GAO; George; Fu

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  15. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    LIU Di; LIU Quan-He; WU Lin-Huan; LIU Bin; WU Jun; LAO Yi-Mei; LI Xiao-Jing; GAO George Fu; MA Jun-Cai

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioin-formatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  16. Evolution of web services in bioinformatics.

    Science.gov (United States)

    Neerincx, Pieter B T; Leunissen, Jack A M

    2005-06-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformaticians have experimented with several strategies to try to integrate data sets and tools. Owing to the lack of standards for data sets and the interfaces of the tools this is not a trivial task. Over the past few years building services with web-based interfaces has become a popular way of sharing the data and tools that have resulted from many bioinformatics projects. This paper discusses the interoperability problem and how web services are being used to try to solve it, resulting in the evolution of tools with web interfaces from HTML/web form-based tools not suited for automatic workflow generation to a dynamic network of XML-based web services that can easily be used to create pipelines.

  17. Bioinformatics in cancer therapy and drug design

    International Nuclear Information System (INIS)

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  18. Whale song analyses using bioinformatics sequence analysis approaches

    Science.gov (United States)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools DNA/protein sequence alignment and alignment-free methods are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, masters thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure; the standardized Euclidian distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree with earlier classifications derived by human observation qualitatively. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.

  19. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  20. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  1. Creating Bioinformatic Workflows within the BioExtract Server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  2. Survey of Natural Language Processing Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Zhiqiang Zeng

    2015-01-01

    Full Text Available Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  3. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    with these web services using a web services choreography language (BPEL4WS. Conclusion While it is relatively straightforward to implement and publish web services, the use of web services choreography engines is still in its infancy. However, industry-wide support and push for web services standards is quickly increasing the chance of success in using web services to unify heterogeneous bioinformatics applications. Due to the immaturity of currently available web services engines, it is still most practical to implement a simple, ad-hoc XML-based workflow by hard coding the workflow as a Java application. For advanced web service users the Collaxa BPEL engine facilitates a configuration and management environment that can fully handle XML-based workflow.

  4. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT, to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs, thus greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work

  5. Can Bioinformatics Be Considered as an Experimental Biological Science?

    OpenAIRE

    Ilzins, Olaf; Isea, Raul; Hoebeke, Johan

    2016-01-01

    The objective of this short report is to reconsider the subject of bioinformatics as just being a tool of experimental biological science. To do that, we introduce three examples to show how bioinformatics could be considered as an experimental science. These examples show how the development of theoretical biological models generates experimentally verifiable computer hypotheses, which necessarily must be validated by experiments in vitro or in vivo.

  6. MotifLab: a tools and data integration workbench for motif discovery and regulatory sequence analysis

    Directory of Open Access Journals (Sweden)

    Klepper Kjetil

    2013-01-01

    Full Text Available Abstract Background Traditional methods for computational motif discovery often suffer from poor performance. In particular, methods that search for sequence matches to known binding motifs tend to predict many non-functional binding sites because they fail to take into consideration the biological state of the cell. In recent years, genome-wide studies have generated a lot of data that has the potential to improve our ability to identify functional motifs and binding sites, such as information about chromatin accessibility and epigenetic states in different cell types. However, it is not always trivial to make use of this data in combination with existing motif discovery tools, especially for researchers who are not skilled in bioinformatics programming. Results Here we present MotifLab, a general workbench for analysing regulatory sequence regions and discovering transcription factor binding sites and cis-regulatory modules. MotifLab supports comprehensive motif discovery and analysis by allowing users to integrate several popular motif discovery tools as well as different kinds of additional information, including phylogenetic conservation, epigenetic marks, DNase hypersensitive sites, ChIP-Seq data, positional binding preferences of transcription factors, transcription factor interactions and gene expression. MotifLab offers several data-processing operations that can be used to create, manipulate and analyse data objects, and complete analysis workflows can be constructed and automatically executed within MotifLab, including graphical presentation of the results. Conclusions We have developed MotifLab as a flexible workbench for motif analysis in a genomic context. The flexibility and effectiveness of this workbench has been demonstrated on selected test cases, in particular two previously published benchmark data sets for single motifs and modules, and a realistic example of genes responding to treatment with forskolin. MotifLab is freely

  7. Established and Emerging Trends in Computational Drug Discovery in the Structural Genomics Era

    DEFF Research Database (Denmark)

    Taboureau, Olivier; Baell, Jonathan B.; Fernández-Recio, Juan;

    2012-01-01

    Bioinformatics and chemoinformatics approaches contribute to hit discovery, hit-to-lead optimization, safety profiling, and target identification and enhance our overall understanding of the health and disease states. A vast repertoire of computational methods has been reported and increasingly c...

  8. BioZone Exploting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    Energy Technology Data Exchange (ETDEWEB)

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  9. Bioinformatics analysis of two-component regulatory systems in Staphylococcus epidermidis

    Institute of Scientific and Technical Information of China (English)

    QIN Zhiqiang; ZHONG Yang; ZHANG Jian; HE Youyu; WU Yang; JIANG Juan; CHEN Jiemin; LUO Xiaomin; QU Di

    2004-01-01

    Sixteen pairs of two-component regulatory systems are identified in the genome of Staphylococcus epidermidis ATCC12228 strain, which is newly sequenced by our laboratory for Medical Molecular Virology and Chinese National Human Genome Center at Shanghai, by using bioinformatics analysis. Comparative analysis of the twocomponent regulatory systems in S. epidermidis and that of S.aureus and Bacillus subtilis shows that these systems may regulate some important biological functions, e.g. growth,biofilm formation, and expression of virulence factors in S.epidermidis. Two conserved domains, i.e. HATPase_c and REC domains, are found in all 16 pairs of two-component proteins.Homologous modelling analysis indicates that there are 4similar HATPase_c domain structures of histidine kinases and 13 similar REC domain structures of response regulators,and there is one AMP-PNP binding pocket in the HATPase_c domain and three active aspartate residues in the REC domain. Preliminary experiment reveals that the bioinformatics analysis of the conserved domain structures in the two-component regulatory systems in S. epidermidis may provide useful information for discovery of potential drug target.

  10. Natural product discovery: past, present, and future.

    Science.gov (United States)

    Katz, Leonard; Baltz, Richard H

    2016-03-01

    Microorganisms have provided abundant sources of natural products which have been developed as commercial products for human medicine, animal health, and plant crop protection. In the early years of natural product discovery from microorganisms (The Golden Age), new antibiotics were found with relative ease from low-throughput fermentation and whole cell screening methods. Later, molecular genetic and medicinal chemistry approaches were applied to modify and improve the activities of important chemical scaffolds, and more sophisticated screening methods were directed at target disease states. In the 1990s, the pharmaceutical industry moved to high-throughput screening of synthetic chemical libraries against many potential therapeutic targets, including new targets identified from the human genome sequencing project, largely to the exclusion of natural products, and discovery rates dropped dramatically. Nonetheless, natural products continued to provide key scaffolds for drug development. In the current millennium, it was discovered from genome sequencing that microbes with large genomes have the capacity to produce about ten times as many secondary metabolites as was previously recognized. Indeed, the most gifted actinomycetes have the capacity to produce around 30-50 secondary metabolites. With the precipitous drop in cost for genome sequencing, it is now feasible to sequence thousands of actinomycete genomes to identify the "biosynthetic dark matter" as sources for the discovery of new and novel secondary metabolites. Advances in bioinformatics, mass spectrometry, proteomics, transcriptomics, metabolomics and gene expression are driving the new field of microbial genome mining for applications in natural product discovery and development. PMID:26739136

  11. SNPTrackTM : an integrated bioinformatics system for genetic association studies

    Directory of Open Access Journals (Sweden)

    Xu Joshua

    2012-07-01

    Full Text Available Abstract A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm.

  12. Effective Online Group Discovery in Trajectory Databases

    DEFF Research Database (Denmark)

    Li, Xiaohui; Ceikute, Vaida; Jensen, Christian S.;

    2013-01-01

    GPS-enabled devices are pervasive nowadays. Finding movement patterns in trajectory data stream is gaining in importance. We propose a group discovery framework that aims to efficiently support the online discovery of moving objects that travel together. The framework adopts a sampling...

  13. PROTEOME-3D: an interactive bioinformatics tool for large-scale data exploration and knowledge discovery.

    Science.gov (United States)

    Lundgren, Deborah H; Eng, Jimmy; Wright, Michael E; Han, David K

    2003-11-01

    Comprehensive understanding of biological systems requires efficient and systematic assimilation of high-throughput datasets in the context of the existing knowledge base. A major limitation in the field of proteomics is the lack of an appropriate software platform that can synthesize a large number of experimental datasets in the context of the existing knowledge base. Here, we describe a software platform, termed PROTEOME-3D, that utilizes three essential features for systematic analysis of proteomics data: creation of a scalable, queryable, customized database for identified proteins from published literature; graphical tools for displaying proteome landscapes and trends from multiple large-scale experiments; and interactive data analysis that facilitates identification of crucial networks and pathways. Thus, PROTEOME-3D offers a standardized platform to analyze high-throughput experimental datasets for the identification of crucial players in co-regulated pathways and cellular processes. PMID:12960178

  14. TGI-Simulator: a visual tool to support the preclinical phase of the drug discovery process by assessing in silico the effect of an anticancer drug.

    Science.gov (United States)

    Terranova, Nadia; Magni, Paolo

    2012-02-01

    This paper presents TGI-Simulator, a software tool designed to show, through a 2-D graphical animation, the simulated time effect of an anticancer drug on a tumor mass by exploiting the well-known Tumor Growth Inhibition (TGI) model published by Simeoni et al. [1]. Simeoni TGI model is a mathematical model routinely used by pharma companies and researchers during the drug development process. The application is based on a Java graphical user interface (GUI) including a self installing differential equation solver implemented in Matlab together with an optimization algorithm that performs model identification via Weighted Least Squares (WLS). However, it can graphically show also the simulation results obtained within other scientific software tools, if they are preventively stored into a suitable ASCII file. The tool would be a valid support also for researchers with no specific skills in scientific calculations and in pharmacokinetic and pharmacodynamic modeling but daily involved in pharma companies drug development processes at different levels. The availability of a movie with a temporal varying 2-D iconographic representation is an original instrument to communicate results and learn Simeoni TGI model and its potential application in preclinical studies. PMID:22005012

  15. A Survey of the Availability of Primary Bioinformatics Web Resources

    Institute of Scientific and Technical Information of China (English)

    Trias Thireou; George Spyrou; Vassilis Atlamazoglou

    2007-01-01

    The explosive growth of the bioinformatics field has led to a large amount of data and software applications publicly available as web resources. However, the lack of persistence of web references is a barrier to a comprehensive shared access. We conducted a study of the current availability and other features of primary bioinformatics web resources (such as software tools and databases). The majority (95%) of the examined bioinformatics web resources were found running on UNIX/Linux operating systems, and the most widely used web server was found to be Apache (or Apache-related products). Of the overall 1,130 Uniform Resource Locators (URLs) examined, 91% were highly available (more than 90% of the time), while only 4% showed low accessibility (less than 50% of the time) during the survey. Furthermore, the most common URL failure modes are presented and analyzed.

  16. Drug targets for lymphatic filariasis: A bioinformatics approach

    Directory of Open Access Journals (Sweden)

    Om Prakash Sharma

    2013-08-01

    Full Text Available This review article discusses the current scenario of the national and international burden due to lymphatic filariasis (LF and describes the active elimination programmes for LF and their achievements to eradicate this most debilitating disease from the earth. Since, bioinformatics is a rapidly growing field of biological study, and it has an increasingly significant role in various fields of biology. We have reviewed its leading involvement in the filarial research using different approaches of bioinformatics and have summarized available existing drugs and their targets to re-examine and to keep away from the resisting conditions. Moreover, some of the novel drug targets have been assembled for further study to design fresh and better pharmacological therapeutics. Various bioinformatics-based web resources, and databases have been discussed, which may enrich the filarial research.

  17. Application of Bioinformatics and Systems Biology in Medicinal Plant Studies

    Institute of Scientific and Technical Information of China (English)

    DENG You-ping; AI Jun-mei; XIAO Pei-gen

    2010-01-01

    One important purpose to investigate medicinal plants is to understand genes and enzymes that govern the biological metabolic process to produce bioactive compounds.Genome wide high throughput technologies such as genomics,transcriptomics,proteomics and metabolomics can help reach that goal.Such technologies can produce a vast amount of data which desperately need bioinformatics and systems biology to process,manage,distribute and understand these data.By dealing with the"omics"data,bioinformatics and systems biology can also help improve the quality of traditional medicinal materials,develop new approaches for the classification and authentication of medicinal plants,identify new active compounds,and cultivate medicinal plant species that tolerate harsh environmental conditions.In this review,the application of bioinformatics and systems biology in medicinal plants is briefly introduced.

  18. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Science.gov (United States)

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  19. Probabilistic models and machine learning in structural bioinformatics

    DEFF Research Database (Denmark)

    Hamelryck, Thomas

    2009-01-01

    and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules and structure determination......Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes......, and the automated analysis and comparison of biomacromolecules in atomic detail. The determination of macromolecular structure from experimental data (for example coming from nuclear magnetic resonance, X-ray crystallography or small angle X-ray scattering) has close ties with the field of structural bioinformatics...

  20. Approaches in integrative bioinformatics towards the virtual cell

    CERN Document Server

    Chen, Ming

    2014-01-01

    Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrat

  1. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    Science.gov (United States)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  2. High-performance computational solutions in protein bioinformatics

    CERN Document Server

    Mrozek, Dariusz

    2014-01-01

    Recent developments in computer science enable algorithms previously perceived as too time-consuming to now be efficiently used for applications in bioinformatics and life sciences. This work focuses on proteins and their structures, protein structure similarity searching at main representation levels and various techniques that can be used to accelerate similarity searches. Divided into four parts, the first part provides a formal model of 3D protein structures for functional genomics, comparative bioinformatics and molecular modeling. The second part focuses on the use of multithreading for

  3. Performance Evaluation of Frequent Subgraph Discovery Techniques

    Directory of Open Access Journals (Sweden)

    Saif Ur Rehman

    2014-01-01

    Full Text Available Due to rapid development of the Internet technology and new scientific advances, the number of applications that model the data as graphs increases, because graphs have highly expressive power to model a complicated structure. Graph mining is a well-explored area of research which is gaining popularity in the data mining community. A graph is a general model to represent data and has been used in many domains such as cheminformatics, web information management system, computer network, and bioinformatics, to name a few. In graph mining the frequent subgraph discovery is a challenging task. Frequent subgraph mining is concerned with discovery of those subgraphs from graph dataset which have frequent or multiple instances within the given graph dataset. In the literature a large number of frequent subgraph mining algorithms have been proposed; these included FSG, AGM, gSpan, CloseGraph, SPIN, Gaston, and Mofa. The objective of this research work is to perform quantitative comparison of the above listed techniques. The performances of these techniques have been evaluated through a number of experiments based on three different state-of-the-art graph datasets. This novel work will provide base for anyone who is working to design a new frequent subgraph discovery technique.

  4. Perspective: Role of structure prediction in materials discovery and design

    Science.gov (United States)

    Needs, Richard J.; Pickard, Chris J.

    2016-05-01

    Materials informatics owes much to bioinformatics and the Materials Genome Initiative has been inspired by the Human Genome Project. But there is more to bioinformatics than genomes, and the same is true for materials informatics. Here we describe the rapidly expanding role of searching for structures of materials using first-principles electronic-structure methods. Structure searching has played an important part in unraveling structures of dense hydrogen and in identifying the record-high-temperature superconducting component in hydrogen sulfide at high pressures. We suggest that first-principles structure searching has already demonstrated its ability to determine structures of a wide range of materials and that it will play a central and increasing part in materials discovery and design.

  5. Snpdat: Easy and rapid annotation of results from de novo snp discovery projects for model and non-model organisms

    Directory of Open Access Journals (Sweden)

    Doran Anthony G

    2013-02-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results Here we present SNPdat, a high throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.

  6. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    OpenAIRE

    Swainston Neil; Griffiths Tony; Hedeler Cornelia; Garwood Christopher; Garwood Kevin; Oliver Stephen G; Paton Norman W

    2006-01-01

    Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated i...

  7. University-level practical activities in bioinformatics benefit voluntary groups of pupils in the last 2 years of school

    OpenAIRE

    Barker, Daniel; Alderson, Rosanna Grace; McDonagh, James L.; Plaisier, Heleen; Comrie, Muriel Margaret; Duncan, Leigh; Muirhead, Gavin T.P.; Sweeney, Stuart D.

    2015-01-01

    This work was supported in part by the Science and Technology Facilities Council under grant ST/M000435/1 to Daniel Barker. Background Bioinformatics—the use of computers in biology—is of major and increasing importance to biological sciences and medicine. We conducted a preliminary investigation of the value of bringing practical, university-level bioinformatics education to the school level. We conducted voluntary activities for pupils at two schools in Scotland (years S5 and S6; pupils ...

  8. Genomics, molecular imaging, bioinformatics, and bio-nano-info integration are synergistic components of translational medicine and personalized healthcare research

    OpenAIRE

    2008-01-01

    Supported by National Science Foundation (NSF), International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design and International Journal of Functional Informatics and Personalized Medicine, IEEE 7th Bioinformatics and Bioengineering attracted more than 600 papers and 500 researchers and medical doctors. It was the only synergistic inter/multidisciplinary IEEE conference with 24 Keynote Lectures, 7 Tutorials, 5 Cutting-Edge Rese...

  9. Coral aquaculture to support drug discovery

    NARCIS (Netherlands)

    Leal, M.C.; Calado, R.; Sheridan, C.; Alimonti, A.; Osinga, R.

    2013-01-01

    Marine natural products (NP) are unanimously acknowledged as the blue gold in the urgent quest for new pharmaceuticals. Although corals are among the marine organisms with the greatest diversity of secondary metabolites, growing evidence suggest that their symbiotic bacteria produce most of these bi

  10. Linking Textual Resources to Support Information Discovery

    OpenAIRE

    Knoth, Petr

    2015-01-01

    A vast amount of information is today stored in the form of textual documents, many of which are available online. These documents come from different sources and are of different types. They include newspaper articles, books, corporate reports, encyclopedia entries and research papers. At a semantic level, these documents contain knowledge, which was created by explicitly connecting information and expressing it in the form of a natural language. However, a significant amount of knowledge is...

  11. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    Science.gov (United States)

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  12. BioRuby: Bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, N.; Prins, J.C.P.; Nakao, M.; Bonnal, R.; Aerts, J.; Katayama, A.

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  13. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  14. CROSSWORK for Glycans: Glycan Identificatin Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope " - CROSSWORKS for glycans:  a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  15. Setubal named deputy director at Virginia Bioinformatics Institute

    OpenAIRE

    Whyte, Barry James

    2005-01-01

    The Virginia Bioinformatics Institute (VBI) at Virginia Tech has announced the appointment of Joao Setubal as the institute's deputy director. As deputy director, Setubal will act on behalf of VBI's executive and scientific director, Bruno Sobral, handling internal administrative functions, as well as scientific decision making.

  16. Robust enzyme design: bioinformatic tools for improved protein stability.

    Science.gov (United States)

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  17. Incorporation of Bioinformatics Exercises into the Undergraduate Biochemistry Curriculum

    Science.gov (United States)

    Feig, Andrew L.; Jabri, Evelyn

    2002-01-01

    The field of bioinformatics is developing faster than most biochemistry textbooks can adapt. Supplementing the undergraduate biochemistry curriculum with data-mining exercises is an ideal way to expose the students to the common databases and tools that take advantage of this vast repository of biochemical information. An integrated collection of…

  18. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA LibrariesG. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  19. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Science.gov (United States)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  20. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  1. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  2. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  3. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  4. Design and Implementation of an Interdepartmental Bioinformatics Program across Life Science Curricula

    Science.gov (United States)

    Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.

    2007-01-01

    Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…

  5. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  6. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  7. 2K09 and Thereafter : The Coming Era of Integrative Bioinformatics, Systems Biology and Intelligent Computing for Functional Genomics and Personalized Medicine Research

    OpenAIRE

    2010-01-01

    Abstract Significant interest exists in establishing synergistic research in bioinformatics, systems biology and intelligent computing. Supported by the United States National Science Foundation (NSF), International Society of Intelligent Biological Medicine (http://www.ISIBM.org), International Journal of Computational Biology and Drug Design (IJCBDD) and International Journal of Functional Informatics and Personalized Medicine, the ISIBM International Joint Co...

  8. Open PHACTS: semantic interoperability for drug discovery.

    Science.gov (United States)

    Williams, Antony J; Harland, Lee; Groth, Paul; Pettifer, Stephen; Chichester, Christine; Willighagen, Egon L; Evelo, Chris T; Blomberg, Niklas; Ecker, Gerhard; Goble, Carole; Mons, Barend

    2012-11-01

    Open PHACTS is a public-private partnership between academia, publishers, small and medium sized enterprises and pharmaceutical companies. The goal of the project is to deliver and sustain an 'open pharmacological space' using and enhancing state-of-the-art semantic web standards and technologies. It is focused on practical and robust applications to solve specific questions in drug discovery research. OPS is intended to facilitate improvements in drug discovery in academia and industry and to support open innovation and in-house non-public drug discovery research. This paper lays out the challenges and how the Open PHACTS project is hoping to address these challenges technically and socially. PMID:22683805

  9. Computational drug discovery

    Institute of Scientific and Technical Information of China (English)

    Si-sheng OU-YANG; Jun-yan LU; Xiang-qian KONG; Zhong-jie LIANG; Cheng LUO; Hualiang JIANG

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing drug discovery and development process.Because of the dramatic increase in the availability of biological macromolecule and small molecule information,the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow,including target identification and validation,lead discovery and optimization and preclinical tests.Over the past decades,computational drug discovery methods such as molecular docking,pharmacophore modeling and mapping,de novo design,molecular similarity calculation and sequence-based virtual screening have been greatly improved.In this review,we present an overview of these important computational methods,platforms and successful applications in this field.

  10. A New Universe of Discoveries

    Science.gov (United States)

    Córdova, France A.

    2016-01-01

    The convergence of emerging advances in astronomical instruments, computational capabilities and talented practitioners (both professional and civilian) is creating an extraordinary new environment for making numerous fundamental discoveries in astronomy, ranging from the nature of exoplanets to understanding the evolution of solar systems and galaxies. The National Science Foundation is playing a critical role in supporting, stimulating, and shaping these advances. NSF is more than an agency of government or a funding mechanism for the infrastructure of science. The work of NSF is a sacred trust that every generation of Americans makes to those of the next generation, that we will build on the body of knowledge we inherit and continue to push forward the frontiers of science. We never lose sight of NSF's obligation to "explore the unexplored" and inspire all of humanity with the wonders of discovery. As the only Federal agency dedicated to the support of basic research and education in all fields of science and engineering, NSF has empowered discoveries across a broad spectrum of scientific inquiry for more than six decades. The result is fundamental scientific research that has had a profound impact on our nation's innovation ecosystem and kept our nation at the very forefront of the world's science-and-engineering enterprise.

  11. Reliable knowledge discovery

    CERN Document Server

    Dai, Honghua; Smirnov, Evgueni

    2012-01-01

    Reliable Knowledge Discovery focuses on theory, methods, and techniques for RKDD, a new sub-field of KDD. It studies the theory and methods to assure the reliability and trustworthiness of discovered knowledge and to maintain the stability and consistency of knowledge discovery processes. RKDD has a broad spectrum of applications, especially in critical domains like medicine, finance, and military. Reliable Knowledge Discovery also presents methods and techniques for designing robust knowledge-discovery processes. Approaches to assessing the reliability of the discovered knowledge are introduc

  12. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach.

  13. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  14. Rise and demise of bioinformatics? Promise and progress.

    Directory of Open Access Journals (Sweden)

    Christos A Ouzounis

    Full Text Available The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

  15. Bioinformatics Data Distribution and Integration via Web Services and XML

    Institute of Scientific and Technical Information of China (English)

    Xiao Li; Yizheng Zhang

    2003-01-01

    It is widely recognized that exchange, distribution, and integration of biological data are the keys to improve bioinformatics and genome biology in post-genomic era. However, the problem of exchanging and integrating biological data is not solved satisfactorily. The eXtensible Markup Language (XML) is rapidly spreading as an emerging standard for structuring documents to exchange and integrate data on the World Wide Web (WWW). Web service is the next generation of WWW and is founded upon the open standards of W3C (World Wide Web Consortium)and IETF (Internet Engineering Task Force). This paper presents XML and Web Services technologies and their use for an appropriate solution to the problem of bioinformatics data exchange and integration.

  16. Security in SPGBP – Simulation of Protein Generation Bioinformatic Project

    Directory of Open Access Journals (Sweden)

    Toma Cristian

    2010-06-01

    Full Text Available Bioinformatics has known a rapid growth in the last decade, along with the development of the genomic projects worldwide. The Human Genome Project alone offers the sequences for 70 000 – 100 000 genes, as terra bytes of information. The major challenge nowadays is to process this huge amount of data. This paper comes to suggest a solution for the retrieval and processing of biological data. The main objective of this section is to point a direction in the development of tools that enable manipulation and efficient access to different types of biological data. Also, the major challenge is to use security in the proposed distributed architecture and to import concepts from bioinformatics into cryptography and IT&C security.

  17. Bioinformatics Analysis of Zinc Transporter from Baoding Alfalfa

    Institute of Scientific and Technical Information of China (English)

    Haibo WANG; Junyun GUO

    2012-01-01

    [Objective] This study aimed to perform the bioinformatics analysis of Zinc transporter (ZnT) from Baoding Alfalfa. [Method] Based on the amino acid sequence, the physical and chemical properties, hydrophilicity/hydrophobicity, secondary structure of ZnT from Baoding alfalfa were predicted by a series of bioinformatics software. And the transmembrane domains were predicted by using different online tools. [Result] ZnT is a hydrophobic protein containing 408 amino acids with the theoretical pl of 5.94, and it has 7 potential transmembrane hydrophobic regions. In the sec- ondary structure, co-helix (Hh) accounted for 48.04%, extended strand (Ee) for 9.56%, random coil (Cc) for 42.40%, which was accored with the characteristic of transmembrane protein. [Conclusion] mZnT is a member of CDF family, responsible for transporting Zn^2+ out of the cell membrane to reduce the concentration and toxicity of Zn^2+.

  18. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  19. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data. PMID:25863787

  20. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  1. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  2. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.  .

  3. Promoting synergistic research and education in genomics and bioinformatics

    OpenAIRE

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia (Michelle); Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personali...

  4. BioJava: an open-source framework for bioinformatics

    OpenAIRE

    Holland, R. C. G.; Down, T. A.; Pocock, M.; Prlić, A.; Huen, D; James, K.; Foisy, S.; Dräger, A.; Yates, A; Heuer, M.; Schreiber, M. J.

    2008-01-01

    Summary: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website...

  5. A quick guide for building a successful bioinformatics community.

    Directory of Open Access Journals (Sweden)

    Aidan Budd

    2015-02-01

    Full Text Available "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i the exchange and development of ideas and expertise; (ii career development; (iii coordinated funding activities; (iv interactions and engagement with professionals from other fields; and (v other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB and the 12th European Conference on Computational Biology (ECCB.

  6. A quick guide for building a successful bioinformatics community.

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  7. The European Bioinformatics Institute’s data resources

    OpenAIRE

    Brooksbank, Catherine; Cameron, Graham; Thornton, Janet

    2009-01-01

    The wide uptake of next-generation sequencing and other ultra-high throughput technologies by life scientists with a diverse range of interests, spanning fundamental biological research, medicine, agriculture and environmental science, has led to unprecedented growth in the amount of data generated. It has also put the need for unrestricted access to biological data at the centre of biology. The European Bioinformatics Institute (EMBL-EBI) is unique in Europe and is one of only two organisati...

  8. Drug targets for lymphatic filariasis: A bioinformatics approach

    OpenAIRE

    Om Prakash Sharma; Yellamandaya Vadlamudi; Arun Gupta Kota; Vikrant Kumar Sinha; Muthuvel Suresh Kumar

    2013-01-01

    This review article discusses the current scenario of the national and international burden due to lymphatic filariasis (LF) and describes the active elimination programmes for LF and their achievements to eradicate this most debilitating disease from the earth. Since, bioinformatics is a rapidly growing field of biological study, and it has an increasingly significant role in various fields of biology. We have reviewed its leading involvement in the filarial research using different approach...

  9. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    OpenAIRE

    Nora Khaldi

    2012-01-01

    ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational b...

  10. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  11. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. Results p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files. p3d's strength arises from the combination of a very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP tree, b set theory and c functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  12. A comparison of common programming languages used in bioinformatics

    Directory of Open Access Journals (Sweden)

    Gillings Michael R

    2008-02-01

    Full Text Available Abstract Background The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Results Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/ Conclusion This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.

  13. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  14. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS, adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded and http://soap.g-language.org/kbws_dl.wsdl (Document/literal.

  15. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  16. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  17. An Integrative Study on Bioinformatics Computing Concepts, Issues and Problems

    Directory of Open Access Journals (Sweden)

    Muhammad Zakarya

    2011-11-01

    Full Text Available Bioinformatics is the permutation and mishmash of biological science and 4IT. The discipline covers every computational tools and techniques used to administer, examine and manipulate huge sets of biological statistics. The discipline also helps in creation of databases to store up and supervise biological statistics, improvement of computer algorithms to find out relations in these databases and use of computer tools for the study and understanding of biological information, including DNA, RNA, protein sequences, gene expression profiles, protein structures, and biochemical pathways. The study of this paper implements an integrative solution. As we know that solution to a problem in a specific discipline may be a solution to another problem in a different discipline. For example entropy that has been rented from physical sciences is solution to most of the problems and issues in computer science. Another example is bioinformatics, where computing method and applications are implemented over biological information. This paper shows an initiative step towards that and will discuss upon the needs for integration of multiple discipline and sciences. Similarly green chemistry gives birth to a new kind of computing i.e. green computing. In next versions of this paper we will study biological fuel cell and will discuss to develop a mobile battery that will be life time charged using the concepts of biological fuel cell. Another issue that we are going to discuss in our series is brain tumor detection. This paper is a review on BI i.e. bioinformatics to start with.

  18. Higgs Discovery Movie

    CERN Multimedia

    2014-01-01

    The ATLAS & CMS Experiments Celebrate the 2nd Anniversary of the Discovery of the Higgs boson. Here, are some images of the path from LHC startup to Nobel Prize, featuring a musical composition by Roger Zare, performed by the Donald Sinta Quartet, called “LHC”. Happy Discovery Day!

  19. Academic Drug Discovery Centres

    DEFF Research Database (Denmark)

    Kirkegaard, Henriette Schultz; Valentin, Finn

    2014-01-01

    Academic drug discovery centres (ADDCs) are seen as one of the solutions to fill the innovation gap in early drug discovery, which has proven challenging for previous organisational models. Prior studies of ADDCs have identified the need to analyse them from the angle of their economic...

  20. Service discovery at home

    NARCIS (Netherlands)

    Sundramoorthy, Vasughi; Scholten, Hans; Jansen, Pierre; Hartel, Pieter

    2003-01-01

    Service discovery is a fairly new field that kicked off since the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several promin

  1. Tales of one gene discovery of a novel candidate receptor in mammalian taste

    OpenAIRE

    Huang, Angela Lilly

    2007-01-01

    There are five basic taste modalities in mammals: bitter, sweet, sour, salty, and Umami (taste of MSG and L-amino acids). Receptors for bitter, sweet, and Umami were previously discovered. Identities of receptors for salty and sour taste modalities remained elusive. In this dissertation, I will present: 1) development of a novel bioinformatics screen to discover candidate receptors; 2) discovery of a novel gene, PKD2L1, in taste receptor cells; 3) evidence demonstrating PKD2L1-expressing tast...

  2. Shared Genetic Etiology between Type 2 Diabetes and Alzheimer's Disease Identified by Bioinformatics Analysis.

    Science.gov (United States)

    Gao, Lei; Cui, Zhen; Shen, Liang; Ji, Hong-Fang

    2015-01-01

    Type 2 diabetes (T2D) and Alzheimer's disease (AD) are two major health issues, and increasing evidence in recent years supports the close connection between these two diseases. The present study aimed to explore the shared genetic etiology underlying T2D and AD based on the available genome wide association studies (GWAS) data collected through August 2014. We performed bioinformatics analyses based on GWAS data of T2D and AD on single nucleotide polymorphisms (SNPs), gene, and pathway levels, respectively. Six SNPs (rs111789331, rs12721046, rs12721051, rs4420638, rs56131196, and rs66626994) were identified for the first time to be shared genetic factors between T2D and AD. Further functional enrichment analysis found lipid metabolism related pathways to be common between these two disorders. The findings may have important implications for future mechanistic and interventional studies for T2D and AD. PMID:26639962

  3. An overview of bioinformatics tools for epitope prediction: implications on vaccine development.

    Science.gov (United States)

    Soria-Guerra, Ruth E; Nieto-Gomez, Ricardo; Govea-Alonso, Dania O; Rosales-Mendoza, Sergio

    2015-02-01

    Exploitation of recombinant DNA and sequencing technologies has led to a new concept in vaccination in which isolated epitopes, capable of stimulating a specific immune response, have been identified and used to achieve advanced vaccine formulations; replacing those constituted by whole pathogen-formulations. In this context, bioinformatics approaches play a critical role on analyzing multiple genomes to select the protective epitopes in silico. It is conceived that cocktails of defined epitopes or chimeric protein arrangements, including the target epitopes, may provide a rationale design capable to elicit convenient humoral or cellular immune responses. This review presents a comprehensive compilation of the most advantageous online immunological software and searchable, in order to facilitate the design and development of vaccines. An outlook on how these tools are supporting vaccine development is presented. HIV and influenza have been taken as examples of promising developments on vaccination against hypervariable viruses. Perspectives in this field are also envisioned.

  4. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    Science.gov (United States)

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To facilitate the ability of non-programming users’ to leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  5. The Greatest Mathematical Discovery?

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, David H.; Borwein, Jonathan M.

    2010-05-12

    What mathematical discovery more than 1500 years ago: (1) Is one of the greatest, if not the greatest, single discovery in the field of mathematics? (2) Involved three subtle ideas that eluded the greatest minds of antiquity, even geniuses such as Archimedes? (3) Was fiercely resisted in Europe for hundreds of years after its discovery? (4) Even today, in historical treatments of mathematics, is often dismissed with scant mention, or else is ascribed to the wrong source? Answer: Our modern system of positional decimal notation with zero, together with the basic arithmetic computational schemes, which were discovered in India about 500 CE.

  6. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology.High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demands of developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of American on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics. The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  7. A Metadata Schema for Geospatial Resource Discovery Use Cases

    OpenAIRE

    Darren Hardy; Kim Durante

    2014-01-01

    We introduce a metadata schema that focuses on GIS discovery use cases for patrons in a research library setting. Text search, faceted refinement, and spatial search and relevancy are among GeoBlacklight's primary use cases for federated geospatial holdings. The schema supports a variety of GIS data types and enables contextual, collection-oriented discovery applications as well as traditional portal applications. One key limitation of GIS resource discovery is the general lack of normative m...

  8. Toxins and drug discovery.

    Science.gov (United States)

    Harvey, Alan L

    2014-12-15

    Components from venoms have stimulated many drug discovery projects, with some notable successes. These are briefly reviewed, from captopril to ziconotide. However, there have been many more disappointments on the road from toxin discovery to approval of a new medicine. Drug discovery and development is an inherently risky business, and the main causes of failure during development programmes are outlined in order to highlight steps that might be taken to increase the chances of success with toxin-based drug discovery. These include having a clear focus on unmet therapeutic needs, concentrating on targets that are well-validated in terms of their relevance to the disease in question, making use of phenotypic screening rather than molecular-based assays, and working with development partners with the resources required for the long and expensive development process. PMID:25448391

  9. Leadership and Discovery

    CERN Document Server

    Goethals, George R

    2009-01-01

    This book, a collection of essays from scholars across disciplines, explores leadership of discovery, probing the guided and collaborative exploration and interpretation of the experience of our inner thoughts and feelings, and of our external worlds

  10. Fateful discovery almost forgotten

    CERN Multimedia

    1989-01-01

    "The discovery of the fission of uranium exactly half a century ago is at risk of passing unremarked because of the general ambivalence towards the consequences of this development. Can that be wise?" (4 pages)

  11. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    Science.gov (United States)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics based algorithm, for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance---and determination---of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely

  12. Application of bioinformatics on the detection of pathogens by Pcr

    International Nuclear Information System (INIS)

    Salmonellas are the main responsible agent for the frequent food-borne gastrointestinal diseases. Their detection using classical methods are laborious and their results take a lot of time to be revealed. In this context, we tried to set up a revealing technique of the invA virulence gene, found in the majority of Salmonella species. After amplification with PCR using specific primers created and verified by bioinformatics programs, two couples of primers were set up and they appeared to be very specific and sensitive for the detection of invA gene. (Author)

  13. Bioinformatics pipeline for functional identification and characterization of proteins

    Science.gov (United States)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; PlÄ der, Wojciech; Przybecki, Zbigniew

    2015-09-01

    The new sequencing methods, called Next Generation Sequencing gives an opportunity to possess a vast amount of data in short time. This data requires structural and functional annotation. Functional identification and characterization of predicted proteins could be done by in silico approches, thanks to a numerous computational tools available nowadays. However, there is a need to confirm the results of proteins function prediction using different programs and comparing the results or confirm experimentally. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  14. Provenance of e-Science Experiments - experience from Bioinformatics

    OpenAIRE

    Greenwood, M.; Goble, C.A.; Stevens, R. D.; Zhao, J.(Central China Normal University (HZNU), Wuhan, 430079, China); Addis, M; Marvin, D; Moreau, L; Oinn, T.

    2003-01-01

    Like experiments performed at a laboratory bench, the data associated with an e-Science experiment are of reduced value if other scientists are not able to identify the origin, or provenance, of those data. Provenance information is essential if experiments are to be validated and verified by others, or even by those who originally performed them. In this article, we give an overview of our initial work on the provenance of bioinformatics e-Science experiments within myGrid. We use two kinds ...

  15. Discovery Driven Growth

    DEFF Research Database (Denmark)

    Bukh, Per Nikolaj

    2009-01-01

    Anmeldelse af Discovery Driven Growh : A breakthrough process to reduce risk and seize opportunity, af Rita G. McGrath & Ian C. MacMillan, Boston: Harvard Business Press. Udgivelsesdato: 14 august......Anmeldelse af Discovery Driven Growh : A breakthrough process to reduce risk and seize opportunity, af Rita G. McGrath & Ian C. MacMillan, Boston: Harvard Business Press. Udgivelsesdato: 14 august...

  16. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    OpenAIRE

    Alva, V.; Nam, S.; Söding, J.; Lupas, A.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been av...

  17. How to Establish a Bioinformatics Postgraduate Degree Programme: A Case Study from South Africa

    OpenAIRE

    Machanick, Philip; Bishop, Ozlem Tastan

    2014-01-01

    The Research Unit in Bioinformatics at Rhodes University (RUBi), South Africa offers a Masters of Science in Bioinformatics. Growing demand for Bioinformatics qualifications results in applications from across Africa. Courses aim to bridge gaps in the diverse backgrounds of students who range from biologists with no prior computing exposure to computer scientists with no biology background. The programme is evenly split between coursework and research, with diverse modules from a range of dep...

  18. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    Directory of Open Access Journals (Sweden)

    Seitzer Phillip

    2012-11-01

    Full Text Available Abstract Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm; each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to make additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discoverd subtle differences in a general transcription factor (GTF binding site motif

  19. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    Science.gov (United States)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinfonnatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degrees case-studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  20. Bioinformatics analysis of metastasis-related proteins in hepatocellular carcinoma

    Institute of Scientific and Technical Information of China (English)

    Pei-Ming Song; Yang Zhang; Yu-Fei He; Hui-Min Bao; Jian-Hua Luo; Yin-Kun Liu; Peng-Yuan Yang; Xian Chen

    2008-01-01

    AIM: To analyze the metastasis-related proteins in hepatocellular carcinoma (HCC) and discover the biomark-er candidates for diagnosis and therapeutic intervention of HCC metastasis with bioinformatics tools.METHODS: Metastasis-related proteins were determined by stable isotope labeling and MS analysis and analyzed with bioinformatics resources, including Phobius, Kyoto encyclopedia of genes and genomes (KEGG), online mendelian inheritance in man (OHIH) and human protein reference database (HPRD).RESULTS: All the metastasis-related proteins were linked to 83 pathways in KEGG, including MAPK and p53 signal pathways. Protein-protein interaction network showed that all the metastasis-related proteins were categorized into 19 function groups, including cell cycle, apoptosis and signal transcluction. OMIM analysis linked these proteins to 186 OMIM entries.CONCLUSION: Metastasis-related proteins provide HCC cells with biological advantages in cell proliferation, migration and angiogenesis, and facilitate metastasis of HCC cells. The bird's eye view can reveal a global charac-teristic of metastasis-related proteins and many differen-tially expressed proteins can be identified as candidates for diagnosis and treatment of HCC.

  1. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    Science.gov (United States)

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms.

  2. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    Science.gov (United States)

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms. PMID:26956711

  3. Comparison: Discovery on WSMOLX and miAamics/jABC

    Science.gov (United States)

    Kubczak, Christian; Vitvar, Tomas; Winkler, Christian; Zaharia, Raluca; Zaremba, Maciej

    This chapter compares the solutions to the SWS-Challenge discovery problems provided by DERI Galway and the joint solution from the Technical University of Dortmund and University of Postdam. The two approaches are described in depth in Chapters 10 and 13. The discovery scenario raises problems associated with making service discovery an automated process. It requires fine-grained specifications of search requests and service functionality including support for fetching dynamic information during the discovery process (e.g., shipment price). Both teams utilize semantics to describe services, service requests and data models in order to enable search at the required fine-grained level of detail.

  4. Bioinformatics analysis suggests base modifications of tRNAs and miRNAs in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Jin Hailing

    2009-04-01

    Full Text Available Abstract Background Modifications of RNA bases have been found in some mRNAs and non-coding RNAs including rRNAs, tRNAs, and snRNAs, where modified bases are important for RNA function. Little is known about RNA base modifications in Arabidopsis thaliana. Results In the current work, we carried out a bioinformatics analysis of RNA base modifications in tRNAs and miRNAs using large numbers of cDNA sequences of small RNAs (sRNAs generated with the 454 technology and the massively parallel signature sequencing (MPSS method. We looked for sRNAs that map to the genome sequence with one-base mismatch (OMM, which indicate candidate modified nucleotides. We obtained 1,187 sites with possible RNA base modifications supported by both 454 and MPSS sequences. Seven hundred and three of these sites were within tRNA loci. Nucleotide substitutions were frequently located in the T arm (substitutions from A to U or G, upstream of the D arm (from G to C, U, or A, and downstream of the D arm (from G to U. The positions of major substitution sites corresponded with the following known RNA base modifications in tRNAs: N1-methyladenosine (m1A, N2-methylguanosine (m2G, and N2-N2-methylguanosine (m22G. Conclusion These results indicate that our bioinformatics method successfully detected modified nucleotides in tRNAs. Using this method, we also found 147 substitution sites in miRNA loci. As with tRNAs, substitutions from A to U or G and from G to C, U, or A were common, suggesting that base modifications might be similar in tRNAs and miRNAs. We suggest that miRNAs contain modified bases and such modifications might be important for miRNA maturation and/or function.

  5. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders

    Directory of Open Access Journals (Sweden)

    Martin Hofmann-Apitius

    2015-12-01

    Full Text Available Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in increasing amounts and types of different data produced by technical profiling technologies applied to clinical samples, animal models, and cellular systems. Yet, progress on unravelling biological mechanisms, causally driving diseases, has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer´s disease or Parkinson´s disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies—data mining techniques using genetic data as a starting point to be further enriched using other data-types, or alternatively to encode prior knowledge about disease mechanisms in a model based framework supporting reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude, that progress would be accelerated by increasing efforts on performing systematic collection of multiple data-types over time from each individual suffering from neurodegenerative disease. The work presented here has been driven by project AETIONOMY; a project funded in the course of the Innovative Medicines Initiative (IMI; which is a public-private partnership of the European Federation of Pharmaceutical Industry Associations

  6. A Decade of Discovery

    Energy Technology Data Exchange (ETDEWEB)

    None

    2008-01-01

    This book provides a fascinating account of some of the most significant scientific discoveries and technological innovations coming out of the U.S. Department of Energy’s National Laboratories. This remarkable book illustrates how the men and women of the National Laboratories are keeping us on the cutting edge. Though few Americans are familiar with the scope and scale of the work conducted at these National Laboratories, their research is literally changing our lives and bettering our planet. The book describes the scientific discoveries and technological advancements "in recognition of the men and women working in DOE's seventeen national laboratories across the country." Through highly vivid and accessible stories, this book details recent breakthroughs in three critical areas: 1) Energy and Environment, 2) National Security and 3) Life and Physical Science. The book illustrates how this government-funded research has resulted in more energy-efficient buildings; new, cleaner alternative fuels that reduce greenhouse gas emissions; safer, more efficient, nuclear power plants; improved responses to disease outbreaks; more secure and streamlined airport security; more effective treatments for cancer and other diseases; and astonishing discoveries that are altering our understanding of the universe and enabling scientific breakthroughs in fields such as nanotechnology and particle physics. Specifically, it contains 37 stories. A Decade of Discovery is truly a recent history of discovery - and a fascinating look at what the next decade holds.

  7. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    Science.gov (United States)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern Biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S States (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bionformatics content of each state's Biology standards were categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, likeliness to engage in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  8. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  9. Sequencing of GJB2 in Cameroonians and Black South Africans and comparison to 1000 Genomes Project Data Support Need to Revise Strategy for Discovery of Nonsyndromic Deafness Genes in Africans.

    Science.gov (United States)

    Bosch, Jason; Noubiap, Jean Jacques N; Dandara, Collet; Makubalo, Nomlindo; Wright, Galen; Entfellner, Jean-Baka Domelevo; Tiffin, Nicki; Wonkam, Ambroise

    2014-11-01

    Mutations in the GJB2 gene, encoding connexin 26, could account for 50% of congenital, nonsyndromic, recessive deafness cases in some Caucasian/Asian populations. There is a scarcity of published data in sub-Saharan Africans. We Sanger sequenced the coding region of the GJB2 gene in 205 Cameroonian and Xhosa South Africans with congenital, nonsyndromic deafness; and performed bioinformatic analysis of variations in the GJB2 gene, incorporating data from the 1000 Genomes Project. Amongst Cameroonian patients, 26.1% were familial. The majority of patients (70%) suffered from sensorineural hearing loss. Ten GJB2 genetic variants were detected by sequencing. A previously reported pathogenic mutation, g.3741_3743delTTC (p.F142del), and a putative pathogenic mutation, g.3816G>A (p.V167M), were identified in single heterozygous samples. Amongst eight the remaining variants, two novel variants, g.3318-41G>A and g.3332G>A, were reported. There were no statistically significant differences in allele frequencies between cases and controls. Principal Components Analyses differentiated between Africans, Asians, and Europeans, but only explained 40% of the variation. The present study is the first to compare African GJB2 sequences with the data from the 1000 Genomes Project and have revealed the low variation between population groups. This finding has emphasized the hypothesis that the prevalence of mutations in GJB2 in nonsyndromic deafness amongst European and Asian populations is due to founder effects arising after these individuals migrated out of Africa, and not to a putative "protective" variant in the genomic structure of GJB2 in Africans. Our results confirm that mutations in GJB2 are not associated with nonsyndromic deafness in Africans.

  10. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  11. Comparative Proteome Bioinformatics: Identification of Phosphotyrosine Signaling Proteins in the Unicellular Protozoan Ciliate Tetrahymena

    DEFF Research Database (Denmark)

    Gammeltoft, Steen; Christensen, Søren Tvorup; Joachimiak, Marcin;

    2005-01-01

    Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH......Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH...

  12. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  13. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  14. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floraino, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  15. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  16. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  17. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  18. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  19. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  20. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  1. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  2. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Science.gov (United States)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  3. The S-Star Trial Bioinformatics Course: An On-line Learning Success

    Science.gov (United States)

    Lim, Yun Ping; Hoog, Jan-Olov; Gardner, Phyllis; Ranganathan, Shoba; Andersson, Siv; Subbiah, Subramanian; Tan, Tin Wee; Hide, Winston; Weiss, Anthony S.

    2003-01-01

    The S-Star Trial Bioinformatics on-line course (www.s-star.org) is a global experiment in bioinformatics distance education. Six universities from five continents have participated in this project. One hundred and fifty students participated in the first trial course of which 96 followed through the entire course and 70 fulfilled the overall…

  4. A Virtual Bioinformatics Knowledge Environment for Early Cancer Detection

    Science.gov (United States)

    Crichton, Daniel; Srivastava, Sudhir; Johnsey, Donald

    2003-01-01

    Discovery of disease biomarkers for cancer is a leading focus of early detection. The National Cancer Institute created a network of collaborating institutions focused on the discovery and validation of cancer biomarkers called the Early Detection Research Network (EDRN). Informatics plays a key role in enabling a virtual knowledge environment that provides scientists real time access to distributed data sets located at research institutions across the nation. The distributed and heterogeneous nature of the collaboration makes data sharing across institutions very difficult. EDRN has developed a comprehensive informatics effort focused on developing a national infrastructure enabling seamless access, sharing and discovery of science data resources across all EDRN sites. This paper will discuss the EDRN knowledge system architecture, its objectives and its accomplishments.

  5. Verbumculus and the Discovery of Unusual Words

    Institute of Scientific and Technical Information of China (English)

    Alerto Apostolico; Fang-Cheng Gong; Stefano Lonardi

    2004-01-01

    Measures relating word frequencies and expectations have been constantly of interest in Bioinfor-matics studies. With sequence data becoming massively available, exhaustive enumeration of such measures have become conceivable, and yet pose significant computational burden even when limited to words of bounded max-imum length. In addition, the display of the huge tables possibly resulting from these counts poses practical problems of visualization and inference. VERBUMCULUS is a suite of software tools for the efficient and fast detection of over- or under-represented words in nucleotide sequences. The inner core of VERBUMCULUS rests on subtly interwoven properties of statistics,pattern matching and combinatorics on words, that enable one to limit drastically and a priori the set of over-or under-represented candidate words of all lengths in a given sequence, thereby rendering it more feasible both to detect and visualize such words in a fast and practically useful way. This paper is devoted to the description of the facility at the outset and to report experimental results, ranging from simulations on synthetic data to the discovery of regulatory elements on the upstream regions of a set of genes of the yeast.The software VERBUMCULUS is accessible at http://www. cs. ucr. edu/ stelo/Verbumculus/or http://wwwdbl.dei. unipd. it/Verbumculus/

  6. Systems biology and bioinformatics in aging research: a workshop report.

    Science.gov (United States)

    Fuellen, Georg; Dengjel, Jörn; Hoeflich, Andreas; Hoeijemakers, Jan; Kestler, Hans A; Kowald, Axel; Priebe, Steffen; Rebholz-Schuhmann, Dietrich; Schmeck, Bernd; Schmitz, Ulf; Stolzing, Alexandra; Sühnel, Jürgen; Wuttke, Daniel; Vera, Julio

    2012-12-01

    In an "aging society," health span extension is most important. As in 2010, talks in this series of meetings in Rostock-Warnemünde demonstrated that aging is an apparently very complex process, where computational work is most useful for gaining insights and to find interventions that counter aging and prevent or counteract aging-related diseases. The specific topics of this year's meeting entitled, "RoSyBA: Rostock Symposium on Systems Biology and Bioinformatics in Ageing Research," were primarily related to "Cancer and Aging" and also had a focus on work funded by the German Federal Ministry of Education and Research (BMBF). The next meeting in the series, scheduled for September 20-21, 2013, will focus on the use of ontologies for computational research into aging, stem cells, and cancer. Promoting knowledge formalization is also at the core of the set of proposed action items concluding this report.

  7. Technical Perspectives on Knowledge Management in Bioinformatics Workflow Systems

    Directory of Open Access Journals (Sweden)

    Walaa N. Ismail

    2015-01-01

    Full Text Available Workflow systems by it’s nature can help bioin-formaticians to plan for their experiments, store, capture and analysis of the runtime generated data. On the other hand, the life science research usually produces new knowledge at an increasing speed; Knowledge such as papers, databases and other systems knowledge that a researcher needs to deal with is actually a complex task that needs much of efforts and time. Thus the management of knowledge is therefore an important issue for life scientists. Approaches has been developed to organize biological knowledge sources and to record provenance knowledge of an experiment into a readily resource are presently being carried out. This article focuses on the knowledge management of in silico experimentation in bioinformatics workflow systems.

  8. The European Bioinformatics Institute in 2016: Data growth and integration.

    Science.gov (United States)

    Cook, Charles E; Bergman, Mary Todd; Finn, Robert D; Cochrane, Guy; Birney, Ewan; Apweiler, Rolf

    2016-01-01

    New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources. PMID:26673705

  9. NMR structure improvement: A structural bioinformatics & visualization approach

    Science.gov (United States)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhances the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  10. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with the query length of 511 amino acid). © 2010 IEEE.

  11. Bioinformatics and the Politics of Innovation in the Life Sciences

    Science.gov (United States)

    Zhou, Yinhua; Datta, Saheli; Salter, Charlotte

    2016-01-01

    The governments of China, India, and the United Kingdom are unanimous in their belief that bioinformatics should supply the link between basic life sciences research and its translation into health benefits for the population and the economy. Yet at the same time, as ambitious states vying for position in the future global bioeconomy they differ considerably in the strategies adopted in pursuit of this goal. At the heart of these differences lies the interaction between epistemic change within the scientific community itself and the apparatus of the state. Drawing on desk-based research and thirty-two interviews with scientists and policy makers in the three countries, this article analyzes the politics that shape this interaction. From this analysis emerges an understanding of the variable capacities of different kinds of states and political systems to work with science in harnessing the potential of new epistemic territories in global life sciences innovation.

  12. Integrative content-driven concepts for bioinformatics ``beyond the cell"

    Indian Academy of Sciences (India)

    Edgar Wingender; Torsten Crass; Jennifer D Hogan; Alexander E Kel; Olga V Kel-Margoulis; Anatolij P Potapov

    2007-01-01

    Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global (‘omics’) approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern ``systems biology”. The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues/organs.

  13. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Directory of Open Access Journals (Sweden)

    Ginés D. Guerrero

    2014-01-01

    Full Text Available Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO. This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to own large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF that their computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor.

  14. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive ''virulence detection chip'' by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  15. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    Science.gov (United States)

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in the steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all complete sequenced bacterial genomes followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functional related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to posses a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were that of Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryachaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbons-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  16. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and  Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, at June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  17. Medical knowledge discovery and management.

    Science.gov (United States)

    Prior, Fred

    2009-05-01

    Although the volume of medical information is growing rapidly, the ability to rapidly convert this data into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.

  18. Evaluating ten discoveries

    Energy Technology Data Exchange (ETDEWEB)

    1973-02-01

    Mexico's state company, Pemex, announces 10 significant oil and gas discoveries in the states of Tamaulipas and Chiapas. Most promising finds are a new oil province in S. Mexico and a deeper pool strike at the offshore Arenque field. The latter seems to point to the existence of an attractive reefal trend extending on shore toward the State of Nuevo Leon.

  19. Discovery of TUG-770

    DEFF Research Database (Denmark)

    Christiansen, Elisabeth; Hansen, Steffen V F; Urban, Christian;

    2013-01-01

    Free fatty acid receptor 1 (FFA1 or GPR40) enhances glucose-stimulated insulin secretion from pancreatic β-cells and currently attracts high interest as a new target for the treatment of type 2 diabetes. We here report the discovery of a highly potent FFA1 agonist with favorable physicochemical...

  20. Ayurvedic drug discovery.

    Science.gov (United States)

    Balachandran, Premalatha; Govindarajan, Rajgopal

    2007-12-01

    Ayurveda is a major traditional system of Indian medicine that is still being successfully used in many countries. Recapitulation and adaptation of the older science to modern drug discovery processes can bring renewed interest to the pharmaceutical world and offer unique therapeutic solutions for a wide range of human disorders. Eventhough time-tested evidences vouch immense therapeutic benefits for ayurvedic herbs and formulations, several important issues are required to be resolved for successful implementation of ayurvedic principles to present drug discovery methodologies. Additionally, clinical examination in the extent of efficacy, safety and drug interactions of newly developed ayurvedic drugs and formulations are required to be carefully evaluated. Ayurvedic experts suggest a reverse-pharmacology approach focusing on the potential targets for which ayurvedic herbs and herbal products could bring tremendous leads to ayurvedic drug discovery. Although several novel leads and drug molecules have already been discovered from ayurvedic medicinal herbs, further scientific explorations in this arena along with customization of present technologies to ayurvedic drug manufacturing principles would greatly facilitate a standardized ayurvedic drug discovery.

  1. Archaeological Discoveries in Liaoning

    Institute of Scientific and Technical Information of China (English)

    1996-01-01

    LIAONING Province, in northeastern China, has been inhabited by many ethnic groups since ancient times. It is one of the sites of China’s earliest civilization. Since the 1950s many archaeological discoveries from periods beginning with the Paleolithic of 200,000 years ago, and through all the following historic periods, have been made in the province.

  2. Computer-aided vaccine designing approach against fish pathogens Edwardsiella tarda and Flavobacterium columnare using bioinformatics softwares

    Directory of Open Access Journals (Sweden)

    Mahendran R

    2016-05-01

    Full Text Available Radha Mahendran,1 Suganya Jeyabaskar,1 Gayathri Sitharaman,1 Rajamani Dinakaran Michael,2 Agnal Vincent Paul1 1Department of Bioinformatics, 2Centre for Fish Immunology, School of Life Sciences, Vels University, Pallavaram, Chennai, Tamil Nadu, India Abstract: Edwardsiella tarda and Flavobacterium columnare are two important intracellular pathogenic bacteria that cause the infectious diseases edwardsiellosis and columnaris in wild and cultured fish. Prediction of major histocompatibility complex (MHC binding is an important issue in T-cell epitope prediction. In a healthy immune system, the T-cells must recognize epitopes and induce the immune response. In this study, T-cell epitopes were predicted by using in silico immunoinformatics approach with the help of bioinformatics tools that are less expensive and are not time consuming. Such identification of binding interaction between peptides and MHC alleles aids in the discovery of new peptide vaccines. We have reported the potential peptides chosen from the outer membrane proteins (OMPs of E. tarda and F. columnare, which interact well with MHC class I alleles. OMPs from E. tarda and F. columnare were selected and analyzed based on their antigenic and immunogenic properties. The OMPs of the genes TolC and FCOL_04620, respectively, from E. tarda and F. columnare were taken for study. Finally, two epitopes from the OMP of E. tarda exhibited excellent protein–peptide interaction when docked with MHC class I alleles. Five epitopes from the OMP of F. columnare had good protein–peptide interaction when docked with MHC class I alleles. Further in vitro studies can aid in the development of potential peptide vaccines using the predicted peptides. Keywords: E. tarda, F. columnare, edwardsiellosis, columnaris, T-cell epitopes, MHC class I, peptide vaccine, outer membrane proteins 

  3. Neurogenomics: An opportunity to integrate neuroscience, genomics and bioinformatics research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-06-01

    Full Text Available Modern genomic approaches have made enormous contributions to improving our understanding of the function, development and evolution of the nervous system, and the diversity within and between species. However, most of these research advances have been recorded in countries with advanced scientific resources and funding support systems. On the contrary, little is known about, for example, the possible interplay between different genes, non-coding elements and environmental factors in modulating neurological diseases among populations in low-income countries, including many African countries. The unique ancestry of African populations suggests that improved inclusion of these populations in neuroscience-related genomic studies would significantly help to identify novel factors that might shape the future of neuroscience research and neurological healthcare. This perspective is strongly supported by the recent identification that diseased individuals and their kindred from specific sub-Saharan African populations lack common neurological disease-associated genetic mutations. This indicates that there may be population-specific causes of neurological diseases, necessitating further investigations into the contribution of additional, presently-unknown genomic factors. Here, we discuss how the development of neurogenomics research in Africa would help to elucidate disease-related genomic variants, and also provide a good basis to develop more effective therapies. Furthermore, neurogenomics would harness African scientists' expertise in neuroscience, genomics and bioinformatics to extend our understanding of the neural basis of behaviour, development and evolution.

  4. Domain Bridging Associations Support Creativity

    OpenAIRE

    Kötter, Tobias; Thiel, Kilian; Berthold, Michael R

    2010-01-01

    This paper proposes a new approach to support creativity through assisting the discovery of unexpected associations across different domains. This is achieved by integrating information from heterogeneous domains into a single network, enabling the interactive discovery of links across the corresponding information resources. We discuss three different pattern of domain crossing associations in this context.

  5. COPD Discovery Might Improve Treatment

    Science.gov (United States)

    ... page: https://medlineplus.gov/news/fullstory_158852.html COPD Discovery Might Improve Treatment Study may help pinpoint ... will progress, a discovery they believe could improve COPD treatment. Their research might help doctors determine which ...

  6. Creating A Guided- discovery Lesson

    Institute of Scientific and Technical Information of China (English)

    田枫

    2005-01-01

    In a guided - discovery lesson, students sequentially uncover layers of mathematical information one step at a time and learn new mathematics. We have identified eight critical steps necessary in developing a successful guided- discovery lesson.

  7. Natural Products as Leads in Schistosome Drug Discovery

    Directory of Open Access Journals (Sweden)

    Bruno J. Neves

    2015-01-01

    Full Text Available Schistosomiasis is a neglected parasitic tropical disease that claims around 200,000 human lives every year. Praziquantel (PZQ, the only drug recommended by the World Health Organization for the treatment and control of human schistosomiasis, is now facing the threat of drug resistance, indicating the urgent need for new effective compounds to treat this disease. Therefore, globally, there is renewed interest in natural products (NPs as a starting point for drug discovery and development for schistosomiasis. Recent advances in genomics, proteomics, bioinformatics, and cheminformatics have brought about unprecedented opportunities for the rapid and more cost-effective discovery of new bioactive compounds against neglected tropical diseases. This review highlights the main contributions that NP drug discovery and development have made in the treatment of schistosomiasis and it discusses how integration with virtual screening (VS strategies may contribute to accelerating the development of new schistosomidal leads, especially through the identification of unexplored, biologically active chemical scaffolds and structural optimization of NPs with previously established activity.

  8. Bioinformatic Characterization of Mosquito Viromes within the Eastern United States and Puerto Rico: Discovery of Novel Viruses

    Science.gov (United States)

    Frey, Kenneth G.; Biser, Tara; Hamilton, Theron; Santos, Carlos J.; Pimentel, Guillermo; Mokashi, Vishwesh P.; Bishop-Lilly, Kimberly A.

    2016-01-01

    Mosquitoes are efficient, militarily relevant vectors of infectious disease pathogens, including many RNA viruses. The vast majority of all viruses are thought to be undiscovered. Accordingly, recent studies have shown that viruses discovered in insects are very divergent from known pathogens and that many of them lack appropriate reference sequences in the public databases. Given that the majority of viruses are likely still undiscovered, environ mental sampling stands to provide much needed reference samples as well as genetic sequences for comparison. In this study, we sought to determine whether samples of mosquitoes collected from different sites (the Caribbean and locations on the US East Coast) could be differentiated using metagenomic analysis of the RNA viral fraction. We report here distinct virome profiles, even from samples collected short distances apart. In addition to profiling the previously known viruses from these samples, we detected a number of viruses that have been previously undiscovered. PMID:27346944

  9. Next Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified continually. In modern berry cultivars, viruses tend to be be asymptomatic in single infections and symptoms only develop after plants accumulate m...

  10. Next-Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified frequently. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate multip...

  11. Discovery of transcriptional regulators and signaling pathways in the developing pituitary gland by bioinformatic and genomic approaches

    OpenAIRE

    Brinkmeier, Michelle L.; Davis, Shannon W.; Carninci, Piero; MacDonald, James W.; Kawai, Jun; Ghosh, Debashis; Hayashizaki, Yoshihide; Lyons, Robert H.; Camper, Sally A.

    2009-01-01

    We report a catalog of the mouse embryonic pituitary gland transcriptome consisting of five cDNA libraries including wild type tissue from E12.5 and E14.5, Prop1df/df mutant at E14.5, and two cDNA subtractions: E14.5 WT-E14.5 Prop1df/df and E14.5 WT-E12.5 WT. DNA sequence information is assembled into a searchable database with gene ontology terms representing 12,009 expressed genes. We validated coverage of the libraries by detecting most known homeobox gene transcription factor cDNAs. A tot...

  12. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Directory of Open Access Journals (Sweden)

    Michael Römer

    Full Text Available Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  13. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  14. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  15. Atlas of Astronomical Discoveries

    CERN Document Server

    Schilling, Govert

    2011-01-01

    Four hundred years ago in Middelburg, in the Netherlands, the telescope was invented. The invention unleashed a revolution in the exploration of the universe. Galileo Galilei discovered mountains on the Moon, spots on the Sun, and moons around Jupiter. Christiaan Huygens saw details on Mars and rings around Saturn. William Herschel discovered a new planet and mapped binary stars and nebulae. Other astronomers determined the distances to stars, unraveled the structure of the Milky Way, and discovered the expansion of the universe. And, as telescopes became bigger and more powerful, astronomers delved deeper into the mysteries of the cosmos. In his Atlas of Astronomical Discoveries, astronomy journalist Govert Schilling tells the story of 400 years of telescopic astronomy. He looks at the 100 most important discoveries since the invention of the telescope. In his direct and accessible style, the author takes his readers on an exciting journey encompassing the highlights of four centuries of astronomy. Spectacul...

  16. The discovery of quarks.

    Science.gov (United States)

    Riordan, M

    1992-05-29

    Quarks are widely recognized today as being among the elementary particles of which matter is composed. The key evidence for their existence came from a series of inelastic electron-nucleon scattering experiments conducted between 1967 and 1973 at the Stanford Linear Accelerator Center. Other theoretical and experimental advances of the 1970s confirmed this discovery, leading to the present standard model of elementary particle physics.

  17. Opportunistic Adaptation Knowledge Discovery

    OpenAIRE

    Badra, Fadi; Cordier, Amélie; Lieber, Jean

    2009-01-01

    The original publication is available at www.springerlink.com International audience Adaptation has long been considered as the Achilles' heel of case-based reasoning since it requires some domain-specific knowledge that is difficult to acquire. In this paper, two strategies are combined in order to reduce the knowledge engineering cost induced by the adaptation knowledge (CA) acquisition task: CA is learned from the case base by the means of knowledge discovery techniques, and the CA a...

  18. Discovery with FAST

    Science.gov (United States)

    Wilkinson, P.

    2016-02-01

    FAST offers "transformational" performance well-suited to finding new phenomena - one of which might be polarised spectral transients. But discoveries will only be made if "the system" provides its users with the necessary opportunities. In addition to designing in as much observational flexibility as possible, FAST should be operated with a philosophy which maximises its "human bandwidth". This band includes the astronomers of tomorrow - many of whom not have yet started school or even been born.

  19. Discovery as a process

    Energy Technology Data Exchange (ETDEWEB)

    Loehle, C.

    1994-05-01

    The three great myths, which form a sort of triumvirate of misunderstanding, are the Eureka! myth, the hypothesis myth, and the measurement myth. These myths are prevalent among scientists as well as among observers of science. The Eureka! myth asserts that discovery occurs as a flash of insight, and as such is not subject to investigation. This leads to the perception that discovery or deriving a hypothesis is a moment or event rather than a process. Events are singular and not subject to description. The hypothesis myth asserts that proper science is motivated by testing hypotheses, and that if something is not experimentally testable then it is not scientific. This myth leads to absurd posturing by some workers conducting empirical descriptive studies, who dress up their study with a ``hypothesis`` to obtain funding or get it published. Methods papers are often rejected because they do not address a specific scientific problem. The fact is that many of the great breakthroughs in silence involve methods and not hypotheses or arise from largely descriptive studies. Those captured by this myth also try to block funding for those developing methods. The third myth is the measurement myth, which holds that determining what to measure is straightforward, so one doesn`t need a lot of introspection to do science. As one ecologist put it to me ``Don`t give me any of that philosophy junk, just let me out in the field. I know what to measure.`` These myths lead to difficulties for scientists who must face peer review to obtain funding and to get published. These myths also inhibit the study of science as a process. Finally, these myths inhibit creativity and suppress innovation. In this paper I first explore these myths in more detail and then propose a new model of discovery that opens the supposedly miraculous process of discovery to doser scrutiny.

  20. RNA Editing and Drug Discovery for Cancer Therapy

    Directory of Open Access Journals (Sweden)

    Wei-Hsuan Huang

    2013-01-01

    Full Text Available RNA editing is vital to provide the RNA and protein complexity to regulate the gene expression. Correct RNA editing maintains the cell function and organism development. Imbalance of the RNA editing machinery may lead to diseases and cancers. Recently, RNA editing has been recognized as a target for drug discovery although few studies targeting RNA editing for disease and cancer therapy were reported in the field of natural products. Therefore, RNA editing may be a potential target for therapeutic natural products. In this review, we provide a literature overview of the biological functions of RNA editing on gene expression, diseases, cancers, and drugs. The bioinformatics resources of RNA editing were also summarized.

  1. Human protein reference database as a discovery resource for proteomics

    Science.gov (United States)

    Peri, Suraj; Navarro, J. Daniel; Kristiansen, Troels Z.; Amanchy, Ramars; Surendranath, Vineeth; Muthusamy, Babylakshmi; Gandhi, T. K. B.; Chandrika, K. N.; Deshpande, Nandan; Suresh, Shubha; Rashmi, B. P.; Shanker, K.; Padma, N.; Niranjan, Vidya; Harsha, H. C.; Talreja, Naveen; Vrushabendra, B. M.; Ramya, M. A.; Yatish, A. J.; Joy, Mary; Shivashankar, H. N.; Kavitha, M. P.; Menezes, Minal; Choudhury, Dipanwita Roy; Ghosh, Neelanjana; Saravana, R.; Chandran, Sreenath; Mohan, Sujatha; Jonnalagadda, Chandra Kiran; Prasad, C. K.; Kumar-Sinha, Chandan; Deshpande, Krishna S.; Pandey, Akhilesh

    2004-01-01

    The rapid pace at which genomic and proteomic data is being generated necessitates the development of tools and resources for managing data that allow integration of information from disparate sources. The Human Protein Reference Database (http://www.hprd.org) is a web-based resource based on open source technologies for protein information about several aspects of human proteins including protein–protein interactions, post-translational modifications, enzyme–substrate relationships and disease associations. This information was derived manually by a critical reading of the published literature by expert biologists and through bioinformatics analyses of the protein sequence. This database will assist in biomedical discoveries by serving as a resource of genomic and proteomic information and providing an integrated view of sequence, structure, function and protein networks in health and disease. PMID:14681466

  2. An Affinity Propagation-Based DNA Motif Discovery Algorithm.

    Science.gov (United States)

    Sun, Chunxiao; Huo, Hongwei; Yu, Qiang; Guo, Haitao; Sun, Zhigang

    2015-01-01

    The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.

  3. An Affinity Propagation-Based DNA Motif Discovery Algorithm

    Directory of Open Access Journals (Sweden)

    Chunxiao Sun

    2015-01-01

    Full Text Available The planted (l,d motif search (PMS is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.

  4. Bioinformatics Identification of Modules of Transcription Factor Binding Sites in Alzheimer's Disease-Related Genes by In Silico Promoter Analysis and Microarrays

    Directory of Open Access Journals (Sweden)

    Regina Augustin

    2011-01-01

    Full Text Available The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases.

  5. Cancer bioinformatics: detection of chromatin states,SNP-containing motifs, and functional enrichment modules

    Institute of Scientific and Technical Information of China (English)

    Xiaobo Zhou

    2013-01-01

    In this editorial preface,I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field:detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer.We expect this issue to provide bioinformatics scientists,cancer biologists,and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  6. BioXSD: the common data-exchange format for everyday bioinformatics web services

    DEFF Research Database (Denmark)

    Kalas, M.; Puntervoll, P.; Joseph, A.;

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use...... of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema...

  7. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Directory of Open Access Journals (Sweden)

    Xiaobo Zhou

    2013-04-01

    Full Text Available In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  8. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  9. Bioinformatic Analysis of Structural Proteins of Paramyxovirus Tianjin Strain

    Institute of Scientific and Technical Information of China (English)

    Li-ying SHI; Mei LI; Xiao-mian LI; Li-jun YUAN; Qing WANG

    2008-01-01

    The amino acid sequences of the NP,P, M, F,HN and L proteins of the paramyxovirus Tianjin strain were analyzed by using the bioinformatics methods. Phylogenetic analysis based on 6 structural proteins among the Tianjin strain and 25 paramyxoviruses showed that the Tianjin strain belonged to the genus Respirovirus, in the subfamily Paramyxovirinae, and was most closely related to Sendal virus (SeV). Phylogenetic analysis with 14 known SeVs showed that Tianjin strain represented a new evolutionary lineage. Similarities comparisons indicated that Tianjin strain P protein was poorly conserved, sharing only 78.7%-91.9% amino acid identity with the known SeVs, while the L protein was the most conserved, having 96.0%-98.0% amino acid identity with the known SeVs. Alignments of amino acid sequences of 6 structural proteins clearly showed that Tianjin strain possessed many unique amino acid substitutions in their protein sequences, 15 in NP, 29 in P, 6 in M, 13 in F, 18 in HN, and 29 in L. These results revealed that Tianjin strain was most likely a new genotype of SeV. The presence of unique amino acid substitutions suggests that Tianjin strain maybe has a significant difference in biological, pathological, immunological, or epidemiological characteristics from the known SeVs.

  10. Databases, models, and algorithms for functional genomics: a bioinformatics perspective.

    Science.gov (United States)

    Singh, Gautam B; Singh, Harkirat

    2005-02-01

    A variety of patterns have been observed on the DNA and protein sequences that serve as control points for gene expression and cellular functions. Owing to the vital role of such patterns discovered on biological sequences, they are generally cataloged and maintained within internationally shared databases. Furthermore,the variability in a family of observed patterns is often represented using computational models in order to facilitate their search within an uncharacterized biological sequence. As the biological data is comprised of a mosaic of sequence-levels motifs, it is significant to unravel the synergies of macromolecular coordination utilized in cell-specific differential synthesis of proteins. This article provides an overview of the various pattern representation methodologies and the surveys the pattern databases available for use to the molecular biologists. Our aim is to describe the principles behind the computational modeling and analysis techniques utilized in bioinformatics research, with the objective of providing insight necessary to better understand and effectively utilize the available databases and analysis tools. We also provide a detailed review of DNA sequence level patterns responsible for structural conformations within the Scaffold or Matrix Attachment Regions (S/MARs).

  11. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  12. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Science.gov (United States)

    Xu, Zhuofei; Hansen, Martin Asser; Hansen, Lars H; Jacquiod, Samuel; Sørensen, Søren J

    2014-01-01

    As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment) and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function. PMID:24691166

  13. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing‐data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  14. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula.

    Science.gov (United States)

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  15. Functional proteomics with new mass spectrometric and bioinformatics tools

    International Nuclear Information System (INIS)

    A comprehensive range of mass spectrometric tools is required to investigate todays life science applications and a strong focus is on addressing the needs of functional proteomics. Application examples are given showing the streamlined process of protein identification from low femtomole amounts of digests. Sample preparation is achieved with a convertible robot for automated 2D gel picking, and MALDI target dispensing. MALDI-TOF or ESI-MS subsequent to enzymatic digestion. A choice of mass spectrometers including Q-q-TOF with multipass capability, MALDI-MS/MS with unsegmented PSD, Ion Trap and FT-MS are discussed for their respective strengths and applications. Bioinformatics software that allows both database work and novel peptide mass spectra interpretation is reviewed. The automated database searching uses either entire digest LC-MSn ESI Ion Trap data or MALDI MS and MS/MS spectra. It is shown how post translational modifications are interactively uncovered and de-novo sequencing of peptides is facilitated

  16. Competing endogenous RNA and interactome bioinformatic analyses on human telomerase.

    Science.gov (United States)

    Arancio, Walter; Pizzolanti, Giuseppe; Genovese, Swonild Ilenia; Baiamonte, Concetta; Giordano, Carla

    2014-04-01

    We present a classic interactome bioinformatic analysis and a study on competing endogenous (ce) RNAs for hTERT. The hTERT gene codes for the catalytic subunit and limiting component of the human telomerase complex. Human telomerase reverse transcriptase (hTERT) is essential for the integrity of telomeres. Telomere dysfunctions have been widely reported to be involved in aging, cancer, and cellular senescence. The hTERT gene network has been analyzed using the BioGRID interaction database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA (http://genemania.org/). The network of interaction of hTERT transcripts has been further analyzed following the competing endogenous (ce) RNA hypotheses (messenger [m] RNAs cross-talk via micro [mi] RNAs) using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest a role for Akt, nuclear factor-κB (NF-κB), heat shock protein 90 (HSP90), p70/p80 autoantigen, 14-3-3 proteins, and dynein in telomere functions. Roles for histone acetylation/deacetylation and proteoglycan metabolism are also proposed. PMID:24713059

  17. Exploiting graphics processing units for computational biology and bioinformatics.

    Science.gov (United States)

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of generalpurpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700. PMID:20658333

  18. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1 and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA, two integrin beta (ITGB, and one syndecan (SDC. Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  19. "Broadband" Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP): Educational Model for Upliftment and Sustainable Development.

    Science.gov (United States)

    Chimusa, Emile R; Mbiyavanga, Mamana; Masilela, Velaphi; Kumuthini, Judit

    2015-11-01

    A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection. PMID:26583922

  20. "Broadband" Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP): Educational Model for Upliftment and Sustainable Development.

    Science.gov (United States)

    Chimusa, Emile R; Mbiyavanga, Mamana; Masilela, Velaphi; Kumuthini, Judit

    2015-11-01

    A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.

  1. Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization

    Directory of Open Access Journals (Sweden)

    McDonald Karen

    2011-08-01

    Full Text Available Abstract Background Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net.

  2. New Milk Protein-Derived Peptides with Potential Antimicrobial Activity: An Approach Based on Bioinformatic Studies

    Science.gov (United States)

    Dziuba, Bartłomiej; Dziuba, Marta

    2014-01-01

    New peptides with potential antimicrobial activity, encrypted in milk protein sequences, were searched for with the use of bioinformatic tools. The major milk proteins were hydrolyzed in silico by 28 enzymes. The obtained peptides were characterized by the following parameters: molecular weight, isoelectric point, composition and number of amino acid residues, net charge at pH 7.0, aliphatic index, instability index, Boman index, and GRAVY index, and compared with those calculated for known 416 antimicrobial peptides including 59 antimicrobial peptides (AMPs) from milk proteins listed in the BIOPEP database. A simple analysis of physico-chemical properties and the values of biological activity indicators were insufficient to select potentially antimicrobial peptides released in silico from milk proteins by proteolytic enzymes. The final selection was made based on the results of multidimensional statistical analysis such as support vector machines (SVM), random forest (RF), artificial neural networks (ANN) and discriminant analysis (DA) available in the Collection of Anti-Microbial Peptides (CAMP database). Eleven new peptides with potential antimicrobial activity were selected from all peptides released during in silico proteolysis of milk proteins. PMID:25141106

  3. New Milk Protein-Derived Peptides with Potential Antimicrobial Activity: An Approach Based on Bioinformatic Studies

    Directory of Open Access Journals (Sweden)

    Bartłomiej Dziuba

    2014-08-01

    Full Text Available New peptides with potential antimicrobial activity, encrypted in milk protein sequences, were searched for with the use of bioinformatic tools. The major milk proteins were hydrolyzed in silico by 28 enzymes. The obtained peptides were characterized by the following parameters: molecular weight, isoelectric point, composition and number of amino acid residues, net charge at pH 7.0, aliphatic index, instability index, Boman index, and GRAVY index, and compared with those calculated for known 416 antimicrobial peptides including 59 antimicrobial peptides (AMPs from milk proteins listed in the BIOPEP database. A simple analysis of physico-chemical properties and the values of biological activity indicators were insufficient to select potentially antimicrobial peptides released in silico from milk proteins by proteolytic enzymes. The final selection was made based on the results of multidimensional statistical analysis such as support vector machines (SVM, random forest (RF, artificial neural networks (ANN and discriminant analysis (DA available in the Collection of Anti-Microbial Peptides (CAMP database. Eleven new peptides with potential antimicrobial activity were selected from all peptides released during in silico proteolysis of milk proteins.

  4. Bioinformatics in high school biology curricula: a study of state science standards.

    Science.gov (United States)

    Wefer, Stephen H; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students.

  5. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    Science.gov (United States)

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, the PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progresses made by Chineses bioinformaticians in the field of PTM Bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. Through comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  6. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools

    Directory of Open Access Journals (Sweden)

    Daisuke Kihara

    2006-01-01

    Full Text Available The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow for users to analyze data from many experiments (data handling, databases, pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlight the most appropriate and useful resources available to cancer researchers. These include not only public databases, but general and specific bioinformatics tools which can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference to cancer researchers and bioinformaticians studying cancer alike.

  7. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  8. Biopython: freely available Python tools for computational molecular biology and bioinformatics

    DEFF Research Database (Denmark)

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T;

    2009-01-01

    SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments...

  9. The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

    Directory of Open Access Journals (Sweden)

    Katayama Toshiaki

    2011-08-01

    Full Text Available Abstract Background The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i a workflow to annotate 100,000 sequences from an invertebrate species; ii an integrated system for analysis of the transcription factor binding sites (TFBSs enriched based on differential gene expression data obtained from a microarray experiment; iii a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i the absence of several useful data or analysis functions in the Web service "space"; ii the lack of documentation of methods; iii lack of

  10. Bioinformatic science and devices for computer analysis and visualization of macromolecules

    Directory of Open Access Journals (Sweden)

    Yu.B. Porozov

    2010-06-01

    Full Text Available The goals and objectives of bioinformatic science are presented in the article. The main methods and approaches used in computer biology are highlighted. Areas in which bioinformatic science can greatly facilitate and speed up the work of practical biologist and pharmacologist are revealed. The features of both the basic packages and software devices for complete, thorough analysis of macromolecules and for development and modeling of ligands and binding centers are described

  11. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    OpenAIRE

    Kumar Ajay; Singh Neetu; Gaurav S.S

    2012-01-01

    Background: The role of Bioinformatics in this modern age of technology advancement can not be over-emphasized. Aim: This study reviews the principle, techniques, and applications of Forensic Bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: The concepts of sequence annotation and whole genome sequencing were possible due to the assimilation of software based tools which are exclusively responsible for the segregation of bulk genomic d...

  12. Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    OpenAIRE

    Boulesteix, Anne-Laure; Janitza, Silke; Kruppa, Jochen; König, Inke R.

    2012-01-01

    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Specia...

  13. Predicted protein-protein interactions in the moss Physcomitrella patens: a new bioinformatic resource

    OpenAIRE

    Schuette, Scott; Piatkowski, Brian; Corley, Aaron; Lang, Daniel; Geisler, Matt

    2015-01-01

    Background Physcomitrella patens, a haploid dominant plant, is fast becoming a useful molecular genetics and bioinformatics tool due to its key phylogenetic position as a bryophyte in the post-genomic era. Genome sequences from select reference species were compared bioinformatically to Physcomitrella patens using reciprocal blasts with the InParanoid software package. A reference protein interaction database assembled using MySQL by compiling BioGrid, BIND, DIP, and Intact databases was quer...

  14. Learning with Support Vector Machines

    CERN Document Server

    Campbell, Colin

    2010-01-01

    Support Vectors Machines have become a well established tool within machine learning. They work well in practice and have now been used across a wide range of applications from recognizing hand-written digits, to face identification, text categorisation, bioinformatics, and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such a

  15. A New Technique to Manage Big Bioinformatics Data Using Genetic Algorithms

    Directory of Open Access Journals (Sweden)

    Huda Jalil Dikhil

    2016-10-01

    Full Text Available The continuous growth of data, mainly the medical data at laboratories becomes very complex to use and to manage by using traditional ways. So, the researchers start studying genetic information field which increased in the past thirty years in bioinformatics domain (the computer science field, genetic biology field, and DNA. This growth of data becomes known as big bioinformatics data. Thus, efficient algorithms such as Genetic Algorithms are needed to deal with this big and vast amount of bioinformatics data in genetic laboratories. So the researchers proposed two models to manage the big bioinformatics data in addition to the traditional model. The first model by applying Genetic Algorithms before MapReduce, the second model by applying Genetic Algorithms after the MapReduce, and the original or the traditional model by applying only MapReduce without using Genetic Algorithms. The three models were implemented and evaluated using big bioinformatics data collected from the Duchenne Muscular Dystrophy (DMD disorder. The researchers conclude that the second model is the best one among the three models in reducing the size of the data, in execution time, and in addition to the ability to manage and summarize big bioinformatics data. Finally by comparing the percentage errors of the second model with the first model and the traditional model, the researchers obtained the following results 1.136%, 10.227%, and 11.363% respectively. So the second model is the most accurate model with the less percentage error.

  16. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    Science.gov (United States)

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies.

  17. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Directory of Open Access Journals (Sweden)

    Xiao Liu

    Full Text Available Extensins (EXTs are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser followed by three to five prolines (Pro residues, which are hydroxylated as hydroxyproline (Hyp and glycosylated. Some EXTs have Tyrosine (Tyr-X-Tyr (where X can be any amino acid motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs, proline-rich extensin-like receptor kinases (PERKs, formin-homolog EXTs (FH EXTs, chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1 classical EXTs were likely derived after the terrestrialization of plants; (2 LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3 monocots have few classical EXTs; (4 Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5 green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  18. Comparison of acceleration techniques forselected low-level bioinformatics operations

    Directory of Open Access Journals (Sweden)

    Daniel eLangenkaemper

    2016-02-01

    Full Text Available Within the recent years clock rates of modern processors stagnated while the demand for computing power continued to grow. This applied particularly for the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g. modern genomics.This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding for a certain techniques for the problem at hand.We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC. Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead.We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually

  19. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  20. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that

  1. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through bioinformatics approach. Among 107 Shigella, 434 CRISPR structure loci were identified with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. The cas gene deletions were discovered in 88 strains. However, there is one strain that does not contain cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Of them, Type I spacer referred to those linked with one gene segment, Type II spacer linked with two or more different gene segments, and Type III spacer undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.

  2. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode polyketides important in pathogenicity. PMID:27388157

  3. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    Science.gov (United States)

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies. PMID:12853721

  4. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  5. Early days in drug discovery by crystallography - personal recollections.

    Science.gov (United States)

    Colman, Peter M

    2013-01-01

    The influences of Lawrence Bragg and Max Perutz are evident in the contemporary emphasis on 'structural enablement' in drug discovery. On this occasion of the centenary of Bragg's equation, his role in supporting the earliest structural studies of biological materials at the Cavendish Laboratory is remembered. The 1962 Nobel Prizes for the structures of DNA and proteins marked the golden anniversary of the von Laue and Bragg discoveries.

  6. Fractionated Marine Invertebrate Extract Libraries for Drug Discovery

    Directory of Open Access Journals (Sweden)

    Chris M. Ireland

    2008-06-01

    Full Text Available The high-throughput screening and drug discovery paradigm has necessitated a change in preparation of natural product samples for screening programs. In an attempt to improve the quality of marine natural products samples for screening, several fractionation strategies were investigated. The final method used HP20SS as a solid support to effectively desalt extracts and fractionate the organic components. Additionally, methods to integrate an automated LCMS fractionation approach to shorten discovery time lines have been implemented.

  7. How to Nurture Scientific Discoveries Despite Their Unpredictable Nature

    CERN Document Server

    Loeb, Abraham

    2012-01-01

    The history of science reveals that major discoveries are not predictable. Naively, one might conclude therefore that it is not possible to artificially cultivate an environment that promotes discoveries. I suggest instead that open research without a programmatic agenda establishes a fertile ground for unexpected breakthroughs. Contrary to current practice, funding agencies should allocate a small fraction of their funds to support research in centers of excellence without programmatic reins tied to specific goals.

  8. Multiple aspects of DNA and RNA from biophysics to bioinformatics

    CERN Document Server

    Chatenay, Didier; Monasson, Remi; Thieffry, Denis; Dalibard, Jean

    2005-01-01

    This book is dedicated to the multiple aspects, that is, biological, physical and computational of DNA and RNA molecules. These molecules, central to vital processes, have been experimentally studied by molecular biologists for five decades since the discovery of the structure of DNA by Watson and Crick in 1953. Recent progresses (e.g. use of DNA chips, manipulations at the single molecule level, availability of huge genomic databases...) have revealed an imperious need for theoretical modelling. Further progresses will clearly not be possible without an integrated understanding of all DNA an

  9. Swift: 10 Years of Discovery

    Science.gov (United States)

    The conference Swift: 10 years of discovery was held in Roma at La Sapienza University on Dec. 2-5 2014 to celebrate 10 years of Swift successes. Thanks to a large attendance and a lively program, it provided the opportunity to review recent advances of our knowledge of the high-energy transient Universe both from the observational and theoretical sides. When Swift was launched on November 20, 2004, its prime objective was to chase Gamma-Ray Bursts and deepen our knowledge of these cosmic explosions. And so it did, unveiling the secrets of long and short GRBs. However, its multi-wavelength instrumentation and fast scheduling capabilities made it the most versatile mission ever flown. Besides GRBs, Swift has observed, and contributed to our understanding of, an impressive variety of targets including AGNs, supernovae, pulsars, microquasars, novae, variable stars, comets, and much more. Swift is continuously discovering rare and surprising events distributed over a wide range of redshifts, out to the most distant transient objects in the Universe. Such a trove of discoveries has been addressed during the conference with sessions dedicated to each class of events. Indeed, the conference in Rome was a spectacular celebration of the Swift 10th anniversary. It included sessions on all types of transient and steady sources. Top scientists from around the world gave invited and contributed talks. There was a large poster session, sumptuous lunches, news interviews and a glorious banquet with officials attending from INAF and ASI. All the presentations, as well as several conference pictures, can be found in the conference website (http://www.brera.inaf.it/Swift10/Welcome.html). These proceedings have been collected owing to the efforts of Paolo D’Avanzo who has followed each paper from submission to final acceptance. Our warmest thanks to Paolo for all his work. The Conference has been made possible by the support from La Sapienza University as well as from the ARAP

  10. Representation Discovery using Harmonic Analysis

    CERN Document Server

    Mahadevan, Sridhar

    2008-01-01

    Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu

  11. Computer-aided vaccine designing approach against fish pathogens Edwardsiella tarda and Flavobacterium columnare using bioinformatics softwares.

    Science.gov (United States)

    Mahendran, Radha; Jeyabaskar, Suganya; Sitharaman, Gayathri; Michael, Rajamani Dinakaran; Paul, Agnal Vincent

    2016-01-01

    Edwardsiella tarda and Flavobacterium columnare are two important intracellular pathogenic bacteria that cause the infectious diseases edwardsiellosis and columnaris in wild and cultured fish. Prediction of major histocompatibility complex (MHC) binding is an important issue in T-cell epitope prediction. In a healthy immune system, the T-cells must recognize epitopes and induce the immune response. In this study, T-cell epitopes were predicted by using in silico immunoinformatics approach with the help of bioinformatics tools that are less expensive and are not time consuming. Such identification of binding interaction between peptides and MHC alleles aids in the discovery of new peptide vaccines. We have reported the potential peptides chosen from the outer membrane proteins (OMPs) of E. tarda and F. columnare, which interact well with MHC class I alleles. OMPs from E. tarda and F. columnare were selected and analyzed based on their antigenic and immunogenic properties. The OMPs of the genes TolC and FCOL_04620, respectively, from E. tarda and F. columnare were taken for study. Finally, two epitopes from the OMP of E. tarda exhibited excellent protein-peptide interaction when docked with MHC class I alleles. Five epitopes from the OMP of F. columnare had good protein-peptide interaction when docked with MHC class I alleles. Further in vitro studies can aid in the development of potential peptide vaccines using the predicted peptides. PMID:27284239

  12. [Bioinformatics of tumor molecular targets from big data].

    Science.gov (United States)

    Huang, Jinyan; Yu, Yingyan

    2015-01-01

    The big data from high throughput research disclosed 4V features: volume of data, variety of data, value for deep mining, and velocity of processing speed. Regarding the whole genome sequencing for human sample, at average 30x of coverage, a total of 100 GB of original data (compression FASTQ format) could be produced. Replying to the binary BAM format, a total of 150 GB data could be produced. In the analysis of high throughput data, we need to combine both clinical information and pathological features. In addition, the data sources of medical research involved in ethical and privacy of patients. At present, the costs are gradually cheaper. For example, a whole genome sequencing by Illumina X Ten with 30x coverage costs about 10,000 RMB, and RNA-seq costs 5000 RMB for a single sample. Therefore, cancer genome research provides opportunities for discovery of molecular targets, but also brings enormous challenges on the data integration and utilization. This article introduces methodologies for high throughput data analysis and processing, and explains possible application on molecular target discovery. PMID:25656022

  13. The Gaggle: An open-source software system for integrating bioinformatics software and data sources

    Directory of Open Access Journals (Sweden)

    Bonneau Richard

    2006-03-01

    Full Text Available Abstract Background Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as of microarrays, of metabolic networks and of predicted protein structure. A crucial challenge is to combine the capabilities of these (and other forthcoming data resources and tools to create a data exploration and analysis environment that does justice to the variety and complexity of systems biology data sets. A solution to this problem should recognize that data types, formats and software in this high throughput age of biology are constantly changing. Results In this paper we describe the Gaggle -a simple, open-source Java software environment that helps to solve the problem of software and database integration. Guided by the classic software engineering strategy of separation of concerns and a policy of semantic flexibility, it integrates existing popular programs and web resources into a user-friendly, easily-extended environment. We demonstrate that four simple data types (names, matrices, networks, and associative arrays are sufficient to bring together diverse databases and software. We highlight some capabilities of the Gaggle with an exploration of Helicobacter pylori pathogenesis genes, in which we identify a putative ricin-like protein -a discovery made possible by simultaneous data exploration using a wide range of publicly available data and a variety of popular bioinformatics software tools. Conclusion We have integrated diverse databases (for example, KEGG, BioCyc, String and software (Cytoscape, DataMatrixViewer, R statistical environment, and TIGR Microarray Expression Viewer. Through this loose coupling of diverse software and databases the Gaggle enables simultaneous exploration of experimental data (mRNA and protein abundance, protein-protein and protein-DNA interactions, functional associations (operon, chromosomal

  14. Denton Vacuum Discovery-550 Sputterer

    Data.gov (United States)

    Federal Laboratory Consortium — Description: CORAL Name: Sputter 2 Similar to the existing 4-Gun Denton Discovery 22 Sputter system, with the following enhancements: Specifications / Capabilities:...

  15. Optogenetics enlightens neuroscience drug discovery.

    Science.gov (United States)

    Song, Chenchen; Knöpfel, Thomas

    2016-02-01

    Optogenetics - the use of light and genetics to manipulate and monitor the activities of defined cell populations - has already had a transformative impact on basic neuroscience research. Now, the conceptual and methodological advances associated with optogenetic approaches are providing fresh momentum to neuroscience drug discovery, particularly in areas that are stalled on the concept of 'fixing the brain chemistry'. Optogenetics is beginning to translate and transit into drug discovery in several key domains, including target discovery, high-throughput screening and novel therapeutic approaches to disease states. Here, we discuss the exciting potential of optogenetic technologies to transform neuroscience drug discovery.

  16. Thresholds for Discovery: EAD Tag Analysis in ArchiveGrid, and Implications for Discovery Systems

    Directory of Open Access Journals (Sweden)

    M. Proffitt

    2013-10-01

    Full Text Available The ArchiveGrid discovery system is made up in part of an aggregation of EAD (Encoded Archival Description encoded finding aids from hundreds of contributing institutions. In creating the ArchiveGrid discovery interface, the OCLC Research project team has long wrestled with what we can reasonably do with the large (120,000+ corpus of EAD documents. This paper presents an analysis of the EAD documents (the largest analysis of EAD documents to date. The analysis is paired with an evaluation of how well the documents support various aspects of online discovery. The paper also establishes a framework for thresholds of completeness and consistency to evaluate the results. We find that, while the EAD standard and encoding practices have not offered support for all aspects of online discovery, especially in a large and heterogeneous aggregation of EAD documents, current trends suggest that the evolution of the EAD standard and the shift from retrospective conversion to new shared tools for improved encoding hold real promise for the future.

  17. Interpretation of a discovery

    Directory of Open Access Journals (Sweden)

    Vučković Vladan

    2006-01-01

    Full Text Available The paper presents the development of the theory of asynchronous motors since Tesla’s discovery until the present day. The theory of steady state, as we know it today, was completed already during the first dozen of years. That was followed by a period of stagnation during a number of decades, when the theory of asynchronous motors was developed only in the framework of the general theory of electric machines, which was stimulated by the problems of the development of synchronous generators and big electric networks. It is only in our time that this simple motor, which was used for a long time just to perform crude tasks, became again the inspiration for the researchers and engineers who enabled it, with the help of power electronics and semi-conductor technology, to be used in the finest drives.

  18. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    Science.gov (United States)

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss issues related to design and optimize bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  19. Genetics of rheumatoid arthritis contributes to biology and drug discovery

    Science.gov (United States)

    Okada, Yukinori; Wu, Di; Trynka, Gosia; Raj, Towfique; Terao, Chikashi; Ikari, Katsunori; Kochi, Yuta; Ohmura, Koichiro; Suzuki, Akari; Yoshida, Shinji; Graham, Robert R.; Manoharan, Arun; Ortmann, Ward; Bhangale, Tushar; Denny, Joshua C.; Carroll, Robert J.; Eyler, Anne E.; Greenberg, Jeffrey D.; Kremer, Joel M.; Pappas, Dimitrios A.; Jiang, Lei; Yin, Jian; Ye, Lingying; Su, Ding-Feng; Yang, Jian; Xie, Gang; Keystone, Ed; Westra, Harm-Jan; Esko, Tõnu; Metspalu, Andres; Zhou, Xuezhong; Gupta, Namrata; Mirel, Daniel; Stahl, Eli A.; Diogo, Dorothée; Cui, Jing; Liao, Katherine; Guo, Michael H.; Myouzen, Keiko; Kawaguchi, Takahisa; Coenen, Marieke J.H.; van Riel, Piet L.C.M.; van de Laar, Mart A.F.J.; Guchelaar, Henk-Jan; Huizinga, Tom W.J.; Dieudé, Philippe; Mariette, Xavier; Bridges, S. Louis; Zhernakova, Alexandra; Toes, Rene E.M.; Tak, Paul P.; Miceli-Richard, Corinne; Bang, So-Young; Lee, Hye-Soon; Martin, Javier; Gonzalez-Gay, Miguel A.; Rodriguez-Rodriguez, Luis; Rantapää-Dahlqvist, Solbritt; Ärlestig, Lisbeth; Choi, Hyon K.; Kamatani, Yoichiro; Galan, Pilar; Lathrop, Mark; Eyre, Steve; Bowes, John; Barton, Anne; de Vries, Niek; Moreland, Larry W.; Criswell, Lindsey A.; Karlson, Elizabeth W.; Taniguchi, Atsuo; Yamada, Ryo; Kubo, Michiaki; Liu, Jun S.; Bae, Sang-Cheol; Worthington, Jane; Padyukov, Leonid; Klareskog, Lars; Gregersen, Peter K.; Raychaudhuri, Soumya; Stranger, Barbara E.; De Jager, Philip L.; Franke, Lude; Visscher, Peter M.; Brown, Matthew A.; Yamanaka, Hisashi; Mimori, Tsuneyo; Takahashi, Atsushi; Xu, Huji; Behrens, Timothy W.; Siminovitch, Katherine A.; Momohara, Shigeki; Matsuda, Fumihiko; Yamamoto, Kazuhiko; Plenge, Robert M.

    2013-01-01

    A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological datasets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA)1. Here, we performed a genome-wide association study (GWAS) meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single nucleotide polymorphisms (SNPs). We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 1012–4. We devised an in-silico pipeline using established bioinformatics methods based on functional annotation5, cis-acting expression quantitative trait loci (cis-eQTL)6, and pathway analyses7–9 – as well as novel methods based on genetic overlap with human primary immunodeficiency (PID), hematological cancer somatic mutations and knock-out mouse phenotypes – to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery. PMID:24390342

  20. Genetics of rheumatoid arthritis contributes to biology and drug discovery.

    Science.gov (United States)

    Okada, Yukinori; Wu, Di; Trynka, Gosia; Raj, Towfique; Terao, Chikashi; Ikari, Katsunori; Kochi, Yuta; Ohmura, Koichiro; Suzuki, Akari; Yoshida, Shinji; Graham, Robert R; Manoharan, Arun; Ortmann, Ward; Bhangale, Tushar; Denny, Joshua C; Carroll, Robert J; Eyler, Anne E; Greenberg, Jeffrey D; Kremer, Joel M; Pappas, Dimitrios A; Jiang, Lei; Yin, Jian; Ye, Lingying; Su, Ding-Feng; Yang, Jian; Xie, Gang; Keystone, Ed; Westra, Harm-Jan; Esko, Tõnu; Metspalu, Andres; Zhou, Xuezhong; Gupta, Namrata; Mirel, Daniel; Stahl, Eli A; Diogo, Dorothée; Cui, Jing; Liao, Katherine; Guo, Michael H; Myouzen, Keiko; Kawaguchi, Takahisa; Coenen, Marieke J H; van Riel, Piet L C M; van de Laar, Mart A F J; Guchelaar, Henk-Jan; Huizinga, Tom W J; Dieudé, Philippe; Mariette, Xavier; Bridges, S Louis; Zhernakova, Alexandra; Toes, Rene E M; Tak, Paul P; Miceli-Richard, Corinne; Bang, So-Young; Lee, Hye-Soon; Martin, Javier; Gonzalez-Gay, Miguel A; Rodriguez-Rodriguez, Luis; Rantapää-Dahlqvist, Solbritt; Arlestig, Lisbeth; Choi, Hyon K; Kamatani, Yoichiro; Galan, Pilar; Lathrop, Mark; Eyre, Steve; Bowes, John; Barton, Anne; de Vries, Niek; Moreland, Larry W; Criswell, Lindsey A; Karlson, Elizabeth W; Taniguchi, Atsuo; Yamada, Ryo; Kubo, Michiaki; Liu, Jun S; Bae, Sang-Cheol; Worthington, Jane; Padyukov, Leonid; Klareskog, Lars; Gregersen, Peter K; Raychaudhuri, Soumya; Stranger, Barbara E; De Jager, Philip L; Franke, Lude; Visscher, Peter M; Brown, Matthew A; Yamanaka, Hisashi; Mimori, Tsuneyo; Takahashi, Atsushi; Xu, Huji; Behrens, Timothy W; Siminovitch, Katherine A; Momohara, Shigeki; Matsuda, Fumihiko; Yamamoto, Kazuhiko; Plenge, Robert M

    2014-02-20

    A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA). Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ∼10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2 - 4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation, cis-acting expression quantitative trait loci and pathway analyses--as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes--to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery. PMID:24390342

  1. Proteomics and Its Application in Biomarker Discovery and Drug Development

    Institute of Scientific and Technical Information of China (English)

    He Qing-Yu; Chiu Jen-Fu

    2004-01-01

    Proteomics is a research field aiming to characterize molecular and cellular dynamics in protein expression and function on a global level. The introduction of proteomics has been greatly broadening our view and accelerating our path in various medical researches. The most significant advantage of proteomics is its ability to examine a whole proteome or sub-proteome in a single experiment so that the protein alterations corresponding to a pathological or biochemical condition at a given time can be considered in an integrated way. Proteomic technology has been extensively used to tackle a wide variety of medical subjects including biomarker discovery and drug development. By complement with other new technique advance in genomics and bioinformatics,proteomics has a great potential to make considerable contribution to biomarker identification and revolutionize drug development process. A brief overview of the proteomic technologies will be provided and the application of proteomics in biomarker discovery and drug development will be discussed using our current research projects as examples.

  2. Bioinformatics and multiepitope DNA immunization to design rational snake antivenom.

    Directory of Open Access Journals (Sweden)

    Simon C Wagstaff

    2006-06-01

    Full Text Available BACKGROUND: Snake venom is a potentially lethal and complex mixture of hundreds of functionally diverse proteins that are difficult to purify and hence difficult to characterize. These difficulties have inhibited the development of toxin-targeted therapy, and conventional antivenom is still generated from the sera of horses or sheep immunized with whole venom. Although life-saving, antivenoms contain an immunoglobulin pool of unknown antigen specificity and known redundancy, which necessitates the delivery of large volumes of heterologous immunoglobulin to the envenomed victim, thus increasing the risk of anaphylactoid and serum sickness adverse effects. Here we exploit recent molecular sequence analysis and DNA immunization tools to design more rational toxin-targeted antivenom. METHODS AND FINDINGS: We developed a novel bioinformatic strategy that identified sequences encoding immunogenic and structurally significant epitopes from an expressed sequence tag database of a venom gland cDNA library of Echis ocellatus, the most medically important viper in Africa. Focusing upon snake venom metalloproteinases (SVMPs that are responsible for the severe and frequently lethal hemorrhage in envenomed victims, we identified seven epitopes that we predicted would be represented in all isomers of this multimeric toxin and that we engineered into a single synthetic multiepitope DNA immunogen (epitope string. We compared the specificity and toxin-neutralizing efficacy of antiserum raised against the string to antisera raised against a single SVMP toxin (or domains or antiserum raised by conventional (whole venom immunization protocols. The SVMP string antiserum, as predicted in silico, contained antibody specificities to numerous SVMPs in E. ocellatus venom and venoms of several other African vipers. More significantly, the antiserum cross-specifically neutralized hemorrhage induced by E. ocellatus and Cerastes cerastes cerastes venoms. CONCLUSIONS: These

  3. Discovery Learning Strategies in English

    Science.gov (United States)

    Singaravelu, G.

    2012-01-01

    The study substantiates that the effectiveness of Discovery Learning method in learning English Grammar for the learners at standard V. Discovery Learning is particularly beneficial for any student learning a second language. It promotes peer interaction and development of the language and the learning of concepts with content. Reichert and…

  4. Discoveries of isotopes by fission

    Indian Academy of Sciences (India)

    M Thoennessen

    2015-09-01

    Of the about 3000 isotopes presently known, about 20% have been discovered in fission. The history of fission as it relates to the discovery of isotopes as well as the various reaction mechanisms leading to isotope discoveries involving fission are presented.

  5. Service discovery using Bloom filters

    NARCIS (Netherlands)

    Goering, Patrick; Heijenk, Geert; Lelieveldt, B.P.F.; Haverkort, B.R.H.M.; Laat, de C.T.A.M.; Heijnsdijk, J.W.J.

    2006-01-01

    A protocol to perform service discovery in adhoc networks is introduced in this paper. Attenuated Bloom filters are used to distribute services to nodes in the neighborhood and thus enable local service discovery. The protocol has been implemented in a discrete event simulator to investigate the beh

  6. 29 CFR 2700.56 - Discovery; general.

    Science.gov (United States)

    2010-07-01

    ...(c) or 111 of the Act has been filed. 30 U.S.C. 815(c) and 821. (e) Completion of discovery... 29 Labor 9 2010-07-01 2010-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or...

  7. The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery.

    Science.gov (United States)

    Dumontier, Michel; Baker, Christopher Jo; Baran, Joachim; Callahan, Alison; Chepelev, Leonid; Cruz-Toledo, José; Del Rio, Nicholas R; Duck, Geraint; Furlong, Laura I; Keath, Nichealla; Klassen, Dana; McCusker, James P; Queralt-Rosinach, Núria; Samwald, Matthias; Villanueva-Rosales, Natalia; Wilkinson, Mark D; Hoehndorf, Robert

    2014-03-06

    The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions in the domains of chemistry, biology, biochemistry, and bioinformatics. SIO provides an ontological foundation for the Bio2RDF linked data for the life sciences project and is used for semantic integration and discovery for SADI-based semantic web services. SIO is freely available to all users under a creative commons by attribution license. See website for further information: http://sio.semanticscience.org.

  8. Model-driven discovery of underground metabolic functions in Escherichia coli

    DEFF Research Database (Denmark)

    Guzmán, Gabriela I.; Utrilla, José; Nurk, Sergey;

    2015-01-01

    of unknown underground pathways stemming from enzymatic cross-reactivity. We demonstrate a workflow that couples constraint-based modeling and bioinformatic tools with KO strain analysis and adaptive laboratory evolution for the purpose of predicting promiscuity at the genome scale. Three cases of genes......E, and gltA and prpC. This study demonstrates how a targeted model-driven approach to discovery can systematically fill knowledge gaps, characterize underground metabolism, and elucidate regulatory mechanisms of adaptation in response to gene KO perturbations.......-scale models, which have been widely used for predicting growth phenotypes in various environments or following a genetic perturbation; however, these predictions occasionally fail. Failed predictions of gene essentiality offer an opportunity for targeting biological discovery, suggesting the presence...

  9. Genomics Politics through Space and Time: The Case of Bioinformatics in Brazil.

    Science.gov (United States)

    Bicudo, Edison

    2016-01-01

    The emergence of scientific disciplines, as well as the policies aimed to steer them, have geographical implications. This becomes visible in areas such as genomics and related fields. In this paper, the relation between scientific evolution, political decisions and geographical configuration is studied. The recent formation of bioinformatics in Brazil is focused on. The study involves an analysis of data collected on the website of CNPq, a funding agency attached to the Ministry of Science and Technology. Furthermore, I conducted fieldwork in four cities, interviewing 15 bioinformaticians. In the history of Brazilian bioinformatics, three periods can be identified. In the first period (1900-1996), bioinformatics was actually absent, but biology research groups were formed which would subsequently explore bioinformatics. The second period (1997-2006) was marked by the emergence of the discipline and geographical concentration of major research groups in the southern part of Brazil. A third period can be pointed to (2007-2014), in which political choices have turned geographical diffusion and institutional equality into a national target. As a consequence of the recent shifts, genomics and bioinformatics researchers have been involved in a debate, some defending the existence of few specialized research and sequencing platforms, whereas others welcoming the constitution of a scientific scenario based on decentralized platforms. I defend an intermediate solution, whereby some places would be selected to be genomics hubs. This would fit the regional diversity of this vast country, in addition to tackling the scientific weaknesses of the northern area.

  10. Resource discovery in distributed digital libraries through visual knowledge navigation

    Institute of Scientific and Technical Information of China (English)

    GU Qian-yi; AHMAD Faisal; SUMNER Tamara

    2005-01-01

    In order to support users to search and browse for various resources, digital libraries are composed of discovery systems which provide user interface and information retrieve system. Recent researches in Information Retrieval have investigated different techniques through improving precision and recall to enhance the effectiveness of discovery system in digital libraries. In this paper, we present our work to enhance discovery system effectiveness with a different approach, through resource discovery based on visual knowledge navigation. In our Strand Map Services project under National Science Digital Library, we introduce the visual resource discovery system called conceptual browsing interfaces, to help educators and learners to locate,comprehend and use educational resources in digital libraries. The paper begins with a short introduction of the Strand Map Services. Then we illustrate the service architecture, the design and implementation of its major components. We will focus our discussion of how the visualization system of the Strand Map Services supports the visual knowledge navigation for distributed digital libraries. This includes the knowledge acquisition of the conceptual browsing interfaces, different knowledge representations in the system perspective and user interface perspective, visualization system modules, algorithm and Web services integration to use visual knowledge navigation to enhance resource discovery in digital libraries.

  11. Informationist support for a study of the role of proteases and peptides in cancer pain

    Directory of Open Access Journals (Sweden)

    Alisa Surkis

    2013-05-01

    Full Text Available Two supplements were awarded to the New York University Health Sciences Libraries from the National Library of Medicine's informationist grant program. These supplements funded research support in a number of areas, including data management and bioinformatics, two fields that the library had recently begun to explore. As such, the supplements were of particular value to the library as a testing ground for these newer services.This paper will discuss a supplement received in support of a grant from the National Institute of Dental and Craniofacial Research (PI: Brian Schmidt on the role of proteases and peptides in cancer pain. A number of barriers were preventing the research team from maximizing the efficiency and effectiveness of their work. A critical component of the research was to identify which proteins, from among hundreds identified in collected samples, to include in preclinical testing. This selection involved laborious and prohibitively time-consuming manual searching of the literature on protein function. Additionally, the research team encompassed ten investigators working in two different cities, which led to issues around the sharing and tracking of both data and citations.The supplement outlined three areas in which the informationists would assist the researchers in overcoming these barriers: 1 creating an automated literature searching system for protein function discovery, 2 introducing tools and associated workflows for sharing citations, and 3 introducing tools and workflows for sharing data and specimens.

  12. Integrated bioinformatics to decipher the ascorbic acid metabolic network in tomato.

    Science.gov (United States)

    Ruggieri, Valentino; Bostan, Hamed; Barone, Amalia; Frusciante, Luigi; Chiusano, Maria Luisa

    2016-07-01

    Ascorbic acid is involved in a plethora of reactions in both plant and animal metabolism. It plays an essential role neutralizing free radicals and acting as enzyme co-factor in several reaction. Since humans are ascorbate auxotrophs, enhancing the nutritional quality of a widely consumed vegetable like tomato is a desirable goal. Although the main reactions of the ascorbate biosynthesis, recycling and translocation pathways have been characterized, the assignment of tomato genes to each enzymatic step of the entire network has never been reported to date. By integrating bioinformatics approaches, omics resources and transcriptome collections today available for tomato, this study provides an overview on the architecture of the ascorbate pathway. In particular, 237 tomato loci were associated with the different enzymatic steps of the network, establishing the first comprehensive reference collection of candidate genes based on the recently released tomato gene annotation. The co-expression analyses performed by using RNA-Seq data supported the functional investigation of main expression patterns for the candidate genes and highlighted a coordinated spatial-temporal regulation of genes of the different pathways across tissues and developmental stages. Taken together these results provide evidence of a complex interplaying mechanism and highlight the pivotal role of functional related genes. The definition of genes contributing to alternative pathways and their expression profiles corroborates previous hypothesis on mechanisms of accumulation of ascorbate in the later stages of fruit ripening. Results and evidences here provided may facilitate the development of novel strategies for biofortification of tomato fruit with Vitamin C and offer an example framework for similar studies concerning other metabolic pathways and species. PMID:27007138

  13. Plant microRNAs and their role in defense against viruses: a bioinformatics approach

    Directory of Open Access Journals (Sweden)

    López Camilo

    2010-07-01

    Full Text Available Abstract Background microRNAs (miRNAs are non-coding short RNAs that regulate gene expression in eukaryotes by translational inhibition or cleavage of complementary mRNAs. In plants, miRNAs are known to target mostly transcription factors and are implicated in diverse aspects of plant growth and development. A role has been suggested for the miRNA pathway in antiviral defense in plants. In this work, a bioinformatics approach was taken to test whether plant miRNAs from six species could have antiviral activity by targeting the genomes of plant infecting viruses. Results All plants showed a repertoire of miRNAs with potential for targeting viral genomes. The viruses were targeted by abundant and conserved miRNA families in regions coding for cylindrical inclusion proteins, capsid proteins, and nuclear inclusion body proteins. The parameters for our predicted miRNA:target pairings in the viral genomes were similar to those for validated targets in the plant genomes, indicating that our predicted pairings might behave in-vivo as natural miRNa-target pairings. Our screening was compared with negative controls comprising randomly generated miRNAs, animal miRNAs, and genomes of animal-infecting viruses. We found that plant miRNAs target plant viruses more efficiently than any other sequences, but also, miRNAs can either preferentially target plant-infecting viruses or target any virus without preference. Conclusions Our results show a strong potential for antiviral activity of plant miRNAs and suggest that the miRNA pathway may be a support mechanism to the siRNA pathway in antiviral defense.

  14. Discovery and Formulation

    Science.gov (United States)

    Clark, Don; Rodger, Caroline

    The uses of Raman spectroscopy in active pharmaceutical ingredient (API) discovery, development and release cover a wide variety of sample types and applications. However, the technique does not have a high prominence in the pharmaceutical industry despite being recognised by regulatory authorities as a suitable methodology for the analysis and release of pharmaceutical API and products. One reason is that other analytical techniques are well established and changing to Raman methods is cost prohibitive considering return on investments (ROI). In addition the technique is often regarded as being one for "experts" and not one for main stream applications. As a consequence Raman spectroscopy is frequently the 2nd or 3rd technique of choice for a specific application. However, due to its unique sampling attributes (e.g. micro and macro measurements direct from the sample, through glass, from well plates or in the presence of water) and selectivity, applications of this technology are found throughout the life cycles of pharmaceutical products. It can therefore be considered to be a spectroscopic common denominator. This chapter highlights a number of routine, specialised and niche Raman spectroscopy applications which have been used in the development of new medicines while detailing some of the limitations of these approaches.

  15. Discovery Mondays: Surveyors' Tools

    CERN Multimedia

    2003-01-01

    Surveyors of all ages, have your rulers and compasses at the ready! This sixth edition of Discovery Monday is your chance to learn about the surveyor's tools - the state of the art in measuring instruments - and see for yourself how they work. With their usual daunting precision, the members of CERN's Surveying Group have prepared some demonstrations and exercises for you to try. Find out the techniques for ensuring accelerator alignment and learn about high-tech metrology systems such as deviation indicators, tracking lasers and total stations. The surveyors will show you how they precisely measure magnet positioning, with accuracy of a few thousandths of a millimetre. You can try your hand at precision measurement using different types of sensor and a modern-day version of the Romans' bubble level, accurate to within a thousandth of a millimetre. You will learn that photogrammetry techniques can transform even a simple digital camera into a remarkable measuring instrument. Finally, you will have a chance t...

  16. Discovery of neptunium

    International Nuclear Information System (INIS)

    This paper reports that a number of distinguished scientist, all knowledgeable about the periodic table, irradiated uranium with neutrons during 1934-1938. They observed a number of beta-emitting activities that seemed to be from transuranium elements. discovery of fission and of neptunium was delayed because they assumed that elements 93 and 94 would have chemical properties similar to those of rhenium and osmium, respectively. After fission was finally demonstrated, as new search for element 93 was initiated by McMillan, who showed that when thin films of uranium are exposed to neutrons, high-energy fission products leave the film; 23-min and 2.3-day activities remain. The 23-min activity was known to be an isotope of uranium. In May 1940 Abelson produced conclusive evidence that the 2.3 day activity was from the transuranium element 93. In a reduce state, element 93 coprecipitates with rare earth fluorides. In an oxidized state, the fluoride no longer coprecipitates. Behavior of element 93 was found to resemble that of uranium

  17. New Directions in Statistical Physics: Econophysics, Bioinformatics, and Pattern Recognition

    International Nuclear Information System (INIS)

    This book contains 18 contributions from different authors. Its subtitle 'Econophysics, Bioinformatics, and Pattern Recognition' says more precisely what it is about: not so much about central problems of conventional statistical physics like equilibrium phase transitions and critical phenomena, but about its interdisciplinary applications. After a long period of specialization, physicists have, over the last few decades, found more and more satisfaction in breaking out of the limitations set by the traditional classification of sciences. Indeed, this classification had never been strict, and physicists in particular had always ventured into other fields. Helmholtz, in the middle of the 19th century, had considered himself a physicist when working on physiology, stressing that the physics of animate nature is as much a legitimate field of activity as the physics of inanimate nature. Later, Max Delbrueck and Francis Crick did for experimental biology what Schroedinger did for its theoretical foundation. And many of the experimental techniques used in chemistry, biology, and medicine were developed by a steady stream of talented physicists who left their proper discipline to venture out into the wider world of science. The development we have witnessed over the last thirty years or so is different. It started with neural networks where methods could be applied which had been developed for spin glasses, but todays list includes vehicular traffic (driven lattice gases), geology (self-organized criticality), economy (fractal stochastic processes and large scale simulations), engineering (dynamical chaos), and many others. By staying in the physics departments, these activities have transformed the physics curriculum and the view physicists have of themselves. In many departments there are now courses on econophysics or on biological physics, and some universities offer degrees in the physics of traffic or in econophysics. In order to document this change of attitude

  18. Choosing experiments to accelerate collective discovery.

    Science.gov (United States)

    Rzhetsky, Andrey; Foster, Jacob G; Foster, Ian T; Evans, James A

    2015-11-24

    A scientist's choice of research problem affects his or her personal career trajectory. Scientists' combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity's importance corresponds to its degree centrality, and a problem's difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.

  19. Innovative bioinformatic approaches for developing peptide-based vaccines against hypervariable viruses.

    Science.gov (United States)

    Sirskyj, Danylo; Diaz-Mitoma, Francisco; Golshani, Ashkan; Kumar, Ashok; Azizi, Ali

    2011-01-01

    The application of the fields of pharmacogenomics and pharmacogenetics to vaccine design has been recently labeled 'vaccinomics'. This newly named area of vaccine research, heavily intertwined with bioinformatics, seems to be leading the charge in developing novel vaccines for currently unmet medical needs against hypervariable viruses such as human immunodeficiency virus (HIV), hepatitis C and emerging avian and swine influenza. Some of the more recent bioinformatic approaches in the area of vaccine research include the use of epitope determination and prediction algorithms for exploring the use of peptide epitopes as vaccine immunogens. This paper briefly discusses and explores some current uses of bioinformatics in vaccine design toward the pursuit of peptide vaccines for hypervariable viruses. The various informatics and vaccine design strategies attempted by other groups toward hypervariable viruses will also be briefly examined, along with the strategy used by our group in the design and synthesis of peptide immunogens for candidate HIV and influenza vaccines.

  20. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.

  1. Advances in knowledge discovery in databases

    CERN Document Server

    Adhikari, Animesh

    2015-01-01

    This book presents recent advances in Knowledge discovery in databases (KDD) with a focus on the areas of market basket database, time-stamped databases and multiple related databases. Various interesting and intelligent algorithms are reported on data mining tasks. A large number of association measures are presented, which play significant roles in decision support applications. This book presents, discusses and contrasts new developments in mining time-stamped data, time-based data analyses, the identification of temporal patterns, the mining of multiple related databases, as well as local patterns analysis.  

  2. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Full Text Available Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT, though some are instead seeing rapid growth (e.g., the GO, R. We find a striking imbalance in resource usage with the top 5% of resource names (133 names accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.

  3. A Survey of Bioinformatics Database and Software Usage through Mining the Literature

    Science.gov (United States)

    Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L.; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905

  4. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Science.gov (United States)

    Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905

  5. Coal Discovery Trail officially opens

    Energy Technology Data Exchange (ETDEWEB)

    Gallinger, C. [Elk Valley Coal Corporation, Sparwood, BC (Canada)

    2004-09-01

    The opening of the 30-kilometre Coal Discovery Trail in August is described. The trail, through a pine, spruce, and larch forest, extends from Sparwood to Fernie and passes through Hosmer, a historic mining site. The trail, part of the Elk Valley Coal Discovery Centre, will be used for hiking, bicycling, horseback riding, and cross-country skiing. The Coal Discovery Centre will provide an interpretive centre that concentrates on history of coal mining and miners, preservation of mining artifacts and sites, and existing technology. 3 figs.

  6. Deep Learning in Drug Discovery.

    Science.gov (United States)

    Gawehn, Erik; Hiss, Jan A; Schneider, Gisbert

    2016-01-01

    Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing from the field of "deep learning". Compared with some of the other life sciences, their application in drug discovery is still limited. Here, we provide an overview of this emerging field of molecular informatics, present the basic concepts of prominent deep learning methods and offer motivation to explore these techniques for their usefulness in computer-assisted drug discovery and design. We specifically emphasize deep neural networks, restricted Boltzmann machine networks and convolutional networks.

  7. Deep Learning in Drug Discovery.

    Science.gov (United States)

    Gawehn, Erik; Hiss, Jan A; Schneider, Gisbert

    2016-01-01

    Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing from the field of "deep learning". Compared with some of the other life sciences, their application in drug discovery is still limited. Here, we provide an overview of this emerging field of molecular informatics, present the basic concepts of prominent deep learning methods and offer motivation to explore these techniques for their usefulness in computer-assisted drug discovery and design. We specifically emphasize deep neural networks, restricted Boltzmann machine networks and convolutional networks. PMID:27491648

  8. "Broadband" Bioinformatics Skills Transfer with the Knowledge Transfer Programme (KTP: Educational Model for Upliftment and Sustainable Development.

    Directory of Open Access Journals (Sweden)

    Emile R Chimusa

    2015-11-01

    Full Text Available A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP. Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.

  9. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  10. Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes

    Science.gov (United States)

    Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.

    2013-01-01

    Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availabilty: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469

  11. Cyberinfrastructure for Atmospheric Discovery

    Science.gov (United States)

    Wilhelmson, R.; Moore, C. W.

    2004-12-01

    Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses . MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development/adaptation of cyberinfrastructure that will enable research simulations, datamining, machine learning and visualization of hurricanes and storms utilizing the high performance computing environments including the TeraGrid. Portal grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. Management of these activities (services) are being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions who are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed

  12. Serendipity: Accidental Discoveries in Science

    Science.gov (United States)

    Roberts, Royston M.

    1989-06-01

    Many of the things discovered by accident are important in our everyday lives: Teflon, Velcro, nylon, x-rays, penicillin, safety glass, sugar substitutes, and polyethylene and other plastics. And we owe a debt to accident for some of our deepest scientific knowledge, including Newton's theory of gravitation, the Big Bang theory of Creation, and the discovery of DNA. Even the Rosetta Stone, the Dead Sea Scrolls, and the ruins of Pompeii came to light through chance. This book tells the fascinating stories of these and other discoveries and reveals how the inquisitive human mind turns accident into discovery. Written for the layman, yet scientifically accurate, this illuminating collection of anecdotes portrays invention and discovery as quintessentially human acts, due in part to curiosity, perserverance, and luck.

  13. Discovery of the Arsenic Isotopes

    CERN Document Server

    Shore, A; Heim, M; Schuh, A; Thoennessen, M

    2009-01-01

    Twenty-nine arsenic isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  14. Discovery of the Scandium Isotopes

    CERN Document Server

    Meierfrankenfeld, D

    2010-01-01

    Twenty-three scandium isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  15. Discovery of the Tungsten Isotopes

    OpenAIRE

    A. Fritsch; Ginepro, J. Q.; Heim, M.; Schuh, A.; SHORE, A.; Thoennessen, M

    2009-01-01

    Thirty-five tungsten isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  16. Discovery of the Vanadium Isotopes

    OpenAIRE

    SHORE, A.; A. Fritsch; Heim, M.; Schuh, A.; Thoennessen, M

    2009-01-01

    Twenty-four vanadium isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  17. Discovery of the Barium Isotopes

    OpenAIRE

    SHORE, A.; A. Fritsch; Ginepro, J. Q.; Heim, M.; Schuh, A.; Thoennessen, M

    2009-01-01

    Thirty-eight barium isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  18. Discovery of the Silver Isotopes

    OpenAIRE

    Schuh, A.; A. Fritsch; Ginepro, J. Q.; Heim, M.; SHORE, A.; Thoennessen, M

    2009-01-01

    Thirty-eight silver isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  19. Discovery of the Cadmium Isotopes

    OpenAIRE

    Amos, S.; Thoennessen, M

    2009-01-01

    Thirty-seven cadmium isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  20. Discovery of the Krypton Isotopes

    OpenAIRE

    Heim, M.; A. Fritsch; Schuh, A.; SHORE, A.; Thoennessen, M

    2009-01-01

    Thirty-two krypton isotopes have been observed so far; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  1. Discovery of the Iron Isotopes

    OpenAIRE

    Schuh, A.; A. Fritsch; Heim, M.; SHORE, A.; Thoennessen, M

    2009-01-01

    Twenty-eight iron isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  2. Discovery of the Gold Isotopes

    OpenAIRE

    Schuh, A.; A. Fritsch; Ginepro, J. Q.; Heim, M.; SHORE, A.; Thoennessen, M

    2009-01-01

    Thirty-six gold isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  3. Discovery of the Cobalt Isotopes

    OpenAIRE

    Szymanski, T; Thoennessen, M

    2009-01-01

    Twenty-six cobalt isotopes have so far been observed; the discovery of these isotopes is discussed. For each isotope a brief summary of the first refereed publication, including the production and identification method, is presented.

  4. Taxonomy Enabled Discovery (TED) Project

    Data.gov (United States)

    National Aeronautics and Space Administration — The proposal addresses the NASA's need to enable scientific discovery and the topic's requirements for: processing large volumes of data, commonly available on the...

  5. Cloud Infrastructures for In Silico Drug Discovery: Economic and Practical Aspects

    OpenAIRE

    Daniele D'Agostino; Andrea Clematis; Alfonso Quarati; Daniele Cesini; Federica Chiappori; Luciano Milanesi; Ivan Merelli

    2013-01-01

    Cloud computing opens new perspectives for small-medium biotechnology laboratories that need to perform bioinformatics analysis in a flexible and effective way. This seems particularly true for hybrid clouds that couple the scalability offered by general-purpose public clouds with the greater control and ad hoc customizations supplied by the private ones. A hybrid cloud broker, acting as an intermediary between users and public providers, can support customers in the selection of the most sui...

  6. A discovery solution from Ebsco

    OpenAIRE

    Martín Michavila, Mariano

    2009-01-01

    EBSCO offers complementary access tools to gain greater value from the discovery solution: a customizable, versatile knowledgebase and listing service that enhances discovery, brings more visibility to the library’s collection, get users to specific content, and makes it easier for librarians to manage e-resources. It is an effective tool for end users who may gather information by browsing (journal lists, TOC, etc.), alerting (journal alerts, RSS) or known-item searching.

  7. Verbessern Discovery Systeme die Informationskompetenz?

    OpenAIRE

    Dörte Böhner

    2013-01-01

    Mit dem Auftauchen von Discovery Systemen entstand die Hoffnung, dass der Schulungsbedarf und damit der Aufwand für die Vermittlung von Informationskompetenz verringert und durch die Systeme selbst die Informationskompetenz verbessert werden kann. Dieser Beitrag beleuchtet kritisch das Zusam­menspiel von Discovery Systemen und Informationskompetenz anhand von Recherchegewohnheiten eher untrainierter Studierender. Anhand einer kurzen Gegenüberstellung der unterschiedlichen Recherche­konzepte v...

  8. Verbessern Discovery Systeme die Informationskompetenz?

    Directory of Open Access Journals (Sweden)

    Dörte Böhner

    2013-09-01

    Full Text Available Mit dem Auftauchen von Discovery Systemen entstand die Hoffnung, dass der Schulungsbedarf und damit der Aufwand für die Vermittlung von Informationskompetenz verringert und durch die Systeme selbst die Informationskompetenz verbessert werden kann. Dieser Beitrag beleuchtet kritisch das Zusam­menspiel von Discovery Systemen und Informationskompetenz anhand von Recherchegewohnheiten eher untrainierter Studierender. Anhand einer kurzen Gegenüberstellung der unterschiedlichen Recherche­konzepte von Suchmaschinen und bibliothekarischen Rechercheangeboten werden Rückschlüsse auf die zu fördernden Schwerpunkte bei der Informationskompetenz gezogen und untersucht, welche Schlüsselrolle Discovery Systeme dabei spielen.The dissemination of discovery systems has raised hopes that the need for extensive teaching of information literacy might be reduced in the future. Also, their intuitive approach has been hopefully seen to enable the development of a better information literacy. This article critically focusses on the interaction of discovery systems and information literacy by analysing research behaviours of comparatively untrained students. In order to define which specific themes should be promoted by information literacy education, the author examines various research concepts of search engines as well as research offers of libraries. On this basis the key role of discovery systems will be explored.

  9. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  10. Get Involved in Planetary Discoveries through New Worlds, New Discoveries

    Science.gov (United States)

    Shupla, Christine; Shipp, S. S.; Halligan, E.; Dalton, H.; Boonstra, D.; Buxner, S.; SMD Planetary Forum, NASA

    2013-01-01

    "New Worlds, New Discoveries" is a synthesis of NASA’s 50-year exploration history which provides an integrated picture of our new understanding of our solar system. As NASA spacecraft head to and arrive at key locations in our solar system, "New Worlds, New Discoveries" provides an integrated picture of our new understanding of the solar system to educators and the general public! The site combines the amazing discoveries of past NASA planetary missions with the most recent findings of ongoing missions, and connects them to the related planetary science topics. "New Worlds, New Discoveries," which includes the "Year of the Solar System" and the ongoing celebration of the "50 Years of Exploration," includes 20 topics that share thematic solar system educational resources and activities, tied to the national science standards. This online site and ongoing event offers numerous opportunities for the science community - including researchers and education and public outreach professionals - to raise awareness, build excitement, and make connections with educators, students, and the public about planetary science. Visitors to the site will find valuable hands-on science activities, resources and educational materials, as well as the latest news, to engage audiences in planetary science topics and their related mission discoveries. The topics are tied to the big questions of planetary science: how did the Sun’s family of planets and bodies originate and how have they evolved? How did life begin and evolve on Earth, and has it evolved elsewhere in our solar system? Scientists and educators are encouraged to get involved either directly or by sharing "New Worlds, New Discoveries" and its resources with educators, by conducting presentations and events, sharing their resources and events to add to the site, and adding their own public events to the site’s event calendar! Visit to find quality resources and ideas. Connect with educators, students and the public to

  11. Can Full Duplex reduce the discovery time in D2D Communication?

    DEFF Research Database (Denmark)

    Gatnau, Marta; Berardinelli, Gilberto; Mahmood, Nurul Huda;

    2016-01-01

    Device-to-device (D2D) communication is considered as one of the key technologies to support new types of services, such as public safety and proximity-based applications. D2D communication requires a discovery phase, i.e., the node awareness procedure prior to the communication phase. Conventional...... half duplex transmission may not be sufficient to provide fast discovery and cope with the strict latency targets of future 5G services. On the other hand, in-band full duplex, by allowing simultaneous transmission and reception, may complete the discovery phase faster. In this paper, the potential of...... full duplex in providing fast discovery for the next 5th generation (5G) system supporting D2D communication is investigated. A design for such system is presented and evaluated via simulations, showing that full duplex can accelerate the discovery phase by supporting a higher transmission probability...

  12. A Metadata Schema for Geospatial Resource Discovery Use Cases

    Directory of Open Access Journals (Sweden)

    Darren Hardy

    2014-07-01

    Full Text Available We introduce a metadata schema that focuses on GIS discovery use cases for patrons in a research library setting. Text search, faceted refinement, and spatial search and relevancy are among GeoBlacklight's primary use cases for federated geospatial holdings. The schema supports a variety of GIS data types and enables contextual, collection-oriented discovery applications as well as traditional portal applications. One key limitation of GIS resource discovery is the general lack of normative metadata practices, which has led to a proliferation of metadata schemas and duplicate records. The ISO 19115/19139 and FGDC standards specify metadata formats, but are intricate, lengthy, and not focused on discovery. Moreover, they require sophisticated authoring environments and cataloging expertise. Geographic metadata standards target preservation and quality measure use cases, but they do not provide for simple inter-institutional sharing of metadata for discovery use cases. To this end, our schema reuses elements from Dublin Core and GeoRSS to leverage their normative semantics, community best practices, open-source software implementations, and extensive examples already deployed in discovery contexts such as web search and mapping. Finally, we discuss a Solr implementation of the schema using a "geo" extension to MODS.

  13. Knowledge Discovery from Biomedical Ontologies in Cross Domains.

    Science.gov (United States)

    Shen, Feichen; Lee, Yugyung

    2016-01-01

    In recent years, there is an increasing demand for sharing and integration of medical data in biomedical research. In order to improve a health care system, it is required to support the integration of data by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems as the conceptual models underlying datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve the semantic interoperability between different ontologies. For the purpose, we fully focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies and the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross domain knowledge discovery from the multiple biological ontologies. Thus, patterns made a greater contribution in the knowledge discovery across multiple ontologies. We have demonstrated the cross domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross domain query processing approach, namely SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies.

  14. Knowledge Discovery from Biomedical Ontologies in Cross Domains

    Science.gov (United States)

    Shen, Feichen; Lee, Yugyung

    2016-01-01

    In recent years, there is an increasing demand for sharing and integration of medical data in biomedical research. In order to improve a health care system, it is required to support the integration of data by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems as the conceptual models underlying datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve the semantic interoperability between different ontologies. For the purpose, we fully focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies and the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross domain knowledge discovery from the multiple biological ontologies. Thus, patterns made a greater contribution in the knowledge discovery across multiple ontologies. We have demonstrated the cross domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross domain query processing approach, namely SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies. PMID:27548262

  15. Knowledge Discovery in Textual Documentation: Qualitative and Quantitative Analyses.

    Science.gov (United States)

    Loh, Stanley; De Oliveira, Jose Palazzo M.; Gastal, Fabio Leite

    2001-01-01

    Presents an application of knowledge discovery in texts (KDT) concerning medical records of a psychiatric hospital. The approach helps physicians to extract knowledge about patients and diseases that may be used for epidemiological studies, for training professionals, and to support physicians to diagnose and evaluate diseases. (Author/AEF)

  16. Introducing Bioinformatics into the Biology Curriculum: Exploring the National Center for Biotechnology Information.

    Science.gov (United States)

    Smith, Thomas M.; Emmeluth, Donald S.

    2002-01-01

    Explains the potential of computer technology in science education and argues for integrating bioinformatics into the biology curriculum. Describes three modules designed to introduce students to technological advancements in biology. Aims to develop a better understanding of molecular biology among students. (YDS)

  17. Bioinformatics and structural characterization of a hypothetical protein from Streptococcus mutans

    DEFF Research Database (Denmark)

    Nan, Jie; Brostromer, Erik; Liu, Xiang-Yu;

    2009-01-01

    . From the interlinking structural and bioinformatics studies, we have concluded that SMU.440 could be involved in polyketide-like antibiotic resistance, providing a better understanding of this hypothetical protein. Besides, the combination of multiple methods in this study can be used as a general...

  18. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.;

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  19. A Complementary Bioinformatics Approach to Identify Potential Plant Cell Wall Glycosyltransferase-Encoding Genes

    DEFF Research Database (Denmark)

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi;

    2004-01-01

    . Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes-all classified in the CAZy system-have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics...

  20. A novel bioinformatic strategy to characterise microbial communities in biogas reactors

    DEFF Research Database (Denmark)

    Treu, Laura; Campanaro, Stefano; De Francisci, Davide;

    2014-01-01

    , J.J. et al., 2011). For this reason we developed a bioinformatics strategy in order to create a tool to review the generated dataset and to obtain a more strict control on the bacterial composition at the species level, with estimation of its reliability. The program perform local similarity search...