WorldWideScience

Sample records for comprehensive bioinformatics analysis

  1. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm.
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  2. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm.
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  3. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  4. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  5. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  6. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes-the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software-the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
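
    As a hedged illustration of the tree enumeration topic named in this abstract, the sketch below counts fully resolved (binary) trees on n labeled taxa using the standard double-factorial results: (2n-5)!! unrooted trees and (2n-3)!! rooted trees. The function names are illustrative assumptions and are not taken from the BioQUEST software.

```python
def double_factorial(k: int) -> int:
    """Return k * (k-2) * (k-4) * ... down to 1 or 2; 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary trees on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted binary trees on n labeled taxa: (2n-3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 5, 10):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    The rapid growth of these counts is one reason exhaustive tree search becomes impractical for even moderately sized data sets.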

  7. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Bioinformatic Analysis of Strawberry GSTF12 Gene

    Science.gov (United States)

    Wang, Xiran; Jiang, Leiyu; Tang, Haoru

    2018-01-01

    GSTF12 has long been known as a key factor in proanthocyanidin accumulation in the plant testa. Bioinformatics analysis of the nucleotide and encoded protein sequences of GSTF12 facilitates the study of genes in the anthocyanin biosynthesis and accumulation pathway. We therefore selected the GSTF12 gene from 11 species, downloaded the nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene by bioinformatics analysis, and constructed a phylogenetic tree. We also analysed the physical and chemical properties of the strawberry GSTF12 gene and the structure of its protein. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction indicated that the protein has one signal peptide and no obvious transmembrane regions.

  9. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  10. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow

    DEFF Research Database (Denmark)

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl

    2015-01-01

    , in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to analyze......-resistant conditions. Our study demonstrates an efficient combined experimental and bioinformatics workflow to identify putative secreted proteins from insulin-resistant skeletal muscle cells, which could easily be adapted to other cellular models....

  11. Rapid cloning and bioinformatic analysis of spinach Y chromosome ...

    Indian Academy of Sciences (India)

    Rapid cloning and bioinformatic analysis of spinach Y chromosome- specific EST sequences. Chuan-Liang Deng, Wei-Li Zhang, Ying Cao, Shao-Jing Wang, ... Arabidopsis thaliana mRNA for mitochondrial half-ABC transporter (STA1 gene). 389 2.31E-13. 98.96. SP3−12. Betula pendula histidine kinase 3 (HK3) mRNA, ...

  12. A Brief Review of Bioinformatics Tools for Glycosylation Analysis by Mass Spectrometry

    OpenAIRE

    Tsai, Pei-Lun; Chen, Sung-Fang

    2017-01-01

    The purpose of this review is to provide updated information regarding bioinformatic software for use in the characterization of glycosylated structures since 2013. A comprehensive review by Woodin et al. (Analyst 138: 2793-2803, 2013; ref. 1) described two main approaches that are introduced for researchers starting in this area: analysis of released glycans and identification of glycopeptides in enzymatic digests. Complementary to that report, this review focuses on m...

  13. Biochip microsystem for bioinformatics recognition and analysis

    Science.gov (United States)

    Lue, Jaw-Chyng (Inventor); Fang, Wai-Chi (Inventor)

    2011-01-01

    A system with applications in pattern recognition, or classification, of DNA assay samples. Because DNA reference and sample material in wells of an assay may be caused to fluoresce depending upon dye added to the material, the resulting light may be imaged onto an embodiment comprising an array of photodetectors and an adaptive neural network, with applications to DNA analysis. Other embodiments are described and claimed.

  14. Bioinformatics analysis of disordered proteins in prokaryotes

    Directory of Open Access Journals (Sweden)

    Malkov Saša N

    2011-03-01

    Background: A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well-defined 3D structure. It has also been proven that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to the functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs: Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. Results: Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups (specifically the L-repair function and N-cell motility and secretion COGs) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc.
Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content

  15. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massively scalable computing and storage, data sharing, and on-demand, anytime and anywhere access to resources and applications; thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.

  16. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and is a gelatinase-producing bacterium that is species-specific to porcine and bovine gelatin. This bacterium offers the possibility of producing enzymes specific to each of the two meat species. The main focus of this research was to identify the gelatinase-encoding gene in L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size and encodes a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Thus, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
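
    The gene sizes and protein lengths reported in this abstract can be cross-checked with simple codon arithmetic. The sketch below is illustrative only and assumes that each reported bp size is a complete open reading frame that includes one terminal stop codon; the abstract does not state this, and the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Amino-acid count implied by an ORF length in base pairs.

    Assumes the ORF length is a whole number of codons; if includes_stop
    is True, the final codon is a stop codon and encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# (reported size in bp, reported protein length in amino acids)
reported = {"P1": (1563, 520), "P2": (1776, 591), "P3": (1701, 566)}

if __name__ == "__main__":
    for name, (bp, aa) in reported.items():
        implied = protein_length_from_orf(bp)
        print(f"{name}: {bp} bp implies {implied} aa; reported {aa} aa")
```

    Under that assumption, all three reported pairs are internally consistent.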

  17. 5th HUPO BPP Bioinformatics Meeting at the European Bioinformatics Institute in Hinxton, UK--Setting the analysis frame.

    Science.gov (United States)

    Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf

    2005-09-01

    The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.

  18. Video Bioinformatics Analysis of Human Embryonic Stem Cell Colony Growth

    Science.gov (United States)

    Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue

    2010-01-01

    Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform data mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the accuracy of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were accurate. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion. PMID:20495527
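
    The segment-then-count idea described in this abstract can be sketched in a few lines. This is not the CL-Quant recipe: a fixed intensity threshold stands in for the segmentation step, frames are plain nested lists of grayscale values, and all names and the threshold value are assumptions made for illustration.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image as rows of 0-255 values


def colony_area(frame: Frame, threshold: int = 128) -> int:
    """Count pixels at or above the threshold (treated as colony)."""
    return sum(1 for row in frame for px in row if px >= threshold)


def area_series(frames: Sequence[Frame], threshold: int = 128) -> List[int]:
    """Colony area in pixels for each frame of a time-lapse sequence."""
    return [colony_area(frame, threshold) for frame in frames]


def fold_change(areas: Sequence[int]) -> float:
    """Ratio of final to initial colony area."""
    if not areas or areas[0] == 0:
        raise ValueError("need a non-empty series with a non-zero first area")
    return areas[-1] / areas[0]


# Toy 3x3 frames in which the bright (colony) region grows over time.
toy_frames = [
    [[0, 0, 0], [0, 200, 200], [0, 0, 0]],
    [[0, 200, 200], [0, 200, 200], [0, 0, 0]],
    [[0, 200, 200], [0, 200, 200], [0, 200, 200]],
]

if __name__ == "__main__":
    areas = area_series(toy_frames)
    print(areas, fold_change(areas))
```

    Real time-lapse data would need the image enhancement step the abstract mentions before a simple threshold could separate colony from background reliably.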

  19. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Extracellular vesicles (EVs) are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. The discovery that EVs contain genetic material in the form of RNA (evRNA) has increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV) held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  20. Agonist Binding to Chemosensory Receptors: A Systematic Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Fabrizio Fierro

    2017-09-01

    Human G-protein coupled receptors (hGPCRs) constitute a large and highly pharmaceutically relevant membrane receptor superfamily. About half of the hGPCRs' family members are chemosensory receptors, involved in bitter taste and olfaction, along with a variety of other physiological processes. Hence these receptors constitute promising targets for pharmaceutical intervention. Molecular modeling has been so far the most important tool to gain insight into agonist binding and receptor activation. Here we investigate both aspects by bioinformatics-based predictions across all bitter taste and odorant receptors for which site-directed mutagenesis data are available. First, we observe that state-of-the-art homology modeling combined with previously used docking procedures reproduces only a limited fraction of ligand/receptor interactions inferred by experiments. This is most probably caused by the low sequence identity with available structural templates, which limits the accuracy of the protein model and in particular of the side-chains' orientations. Methods which transcend the limited sampling of the conformational space of docking may improve the predictions. As an example corroborating this, we review here multi-scale simulations from our lab and show that, for the three complexes studied so far, they significantly enhance the predictive power of the computational approach. Second, our bioinformatics analysis provides support to previous claims that several residues, including those at positions 1.50, 2.50, and 7.52, are involved in receptor activation.

  1. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  2. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. © The Author 2015. Published by Oxford University Press on behalf of the European

  3. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  4. Bioinformatics Database Tools in Analysis of Genetics of Neurodevelopmental Disorders

    Directory of Open Access Journals (Sweden)

    Dibyashree Mallik

    2017-10-01

    Bioinformatics tools are now used in various sectors of biology. Many questions regarding neurodevelopmental disorders, which have recently arisen as a major health issue, can be addressed using various bioinformatics databases. Schizophrenia is one such mental disorder; it is now emerging as a major threat among young people because it is mostly seen during late adolescence or early adulthood. Databases such as DISGENET, GWAS, PHARMGKB, and DRUGBANK hold large repositories of genes associated with schizophrenia. We found that many genes are associated with schizophrenia, but approximately 200 genes were found to be present in any of these databases. After further screening, 20 genes were found to be highly associated with each other and are also common to many other diseases. All of them also serve as common target genes of many antipsychotic drugs. Analysis of biological properties and molecular function showed that these 20 genes are mostly involved in biological regulation processes and have receptor activity; they belong mainly to the receptor protein class. Among these 20 genes, CYP2C9, CYP3A4, DRD2, HTR1A and HTR2A are shown to be the main target genes of most antipsychotic drugs and are associated with more than 40% of diseases. The basic findings of the present study indicate that a suitable combination drug could be designed by targeting these genes for better treatment of schizophrenia.

  5. In silico cloning and bioinformatic analysis of PEPCK gene in ...

    African Journals Online (AJOL)

    Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. According to the relative conservation of homologous gene, a bioinformatics strategy was applied to clone Fusarium ...

  6. PATRIC, the bacterial bioinformatics database and analysis resource

    Science.gov (United States)

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  7. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress; its functions have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, 73 MAPKKK family members were identified by homolog search, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  8. PATRIC, the bacterial bioinformatics database and analysis resource.

    Science.gov (United States)

    Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.

  9. BioGPS descriptors for rational engineering of enzyme promiscuity and structure based bioinformatic analysis.

    Directory of Open Access Journals (Sweden)

    Valerio Ferrario

    Full Text Available A new bioinformatic methodology was developed founded on the Unsupervised Pattern Cognition Analysis of GRID-based BioGPS descriptors (Global Positioning System in Biological Space). The procedure relies entirely on three-dimensional structure analysis of enzymes and does not stem from sequence or structure alignment. The BioGPS descriptors account for chemical, geometrical and physical-chemical features of enzymes and are able to describe comprehensively the active site of enzymes in terms of "pre-organized environment" able to stabilize the transition state of a given reaction. The efficiency of this new bioinformatic strategy was demonstrated by the consistent clustering of four different Ser hydrolases classes, which are characterized by the same active site organization but able to catalyze different reactions. The method was validated by considering, as a case study, the engineering of amidase activity into the scaffold of a lipase. The BioGPS tool predicted correctly the properties of lipase variants, as demonstrated by the projection of mutants inside the BioGPS "roadmap".

  10. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  11. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    Science.gov (United States)

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been made in recent years thanks to Metagenomics, which allows the composition and functions of mixed microbial communities in any environmental niche to be unveiled. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred to as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. Access to proper computational systems for the large-scale bioinformatic analysis of HTS data currently represents one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities by a completely automated workflow, comprising all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are absolutely required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented into a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills.
A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent

  12. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. © The Author 2014. Published by Oxford University Press.

  13. DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function.

    Science.gov (United States)

    Cheng, Liang; Hu, Yang; Sun, Jie; Zhou, Meng; Jiang, Qinghua

    2018-06-01

    DincRNA aims to provide a comprehensive web-based bioinformatics toolkit to elucidate the entangled relationships among diseases and non-coding RNAs (ncRNAs) from the perspective of disease similarity. The quantitative way to illustrate relationships of pair-wise diseases always depends on their molecular mechanisms, and structures of the directed acyclic graph of Disease Ontology (DO). Corresponding methods for calculating similarity of pair-wise diseases involve Resnik's, Lin's, Wang's, PSB and SemFunSim methods. Recently, disease similarity was validated as suitable for calculating functional similarities of ncRNAs and prioritizing ncRNA-disease pairs, and it has been widely applied for predicting ncRNA function due to the limited biological knowledge from wet lab experiments of these RNAs. For this purpose, a large number of algorithms and prior knowledge need to be integrated, e.g. 'pair-wise best, pairs-average' (PBPA) and 'pair-wise all, pairs-maximum' (PAPM) methods for calculating functional similarities of ncRNAs, and random walk with restart (RWR) method for prioritizing ncRNA-disease pairs. To facilitate the exploration of disease associations and ncRNA function, DincRNA implemented all of the above eight algorithms based on DO and disease-related genes. Currently, it provides the function to query disease similarity scores, miRNA and lncRNA functional similarity scores, and the prioritization scores of lncRNA-disease and miRNA-disease pairs. Availability: http://bio-annotation.cn:18080/DincRNAClient/. Contact: biofomeng@hotmail.com or qhjiang@hit.edu.cn. Supplementary data are available at Bioinformatics online.
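    The random walk with restart (RWR) step named in this record can be illustrated with a minimal sketch. It is not DincRNA's implementation: the graph, the node names, the restart probability of 0.5 and the function name random_walk_with_restart are all assumptions made for illustration. The walker moves to a uniformly chosen neighbour with probability 1 - restart and jumps back to the seed set otherwise; the stationary visiting probabilities are used as prioritization scores.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the change is below tol.

    adj maps each node to a list of its neighbours (undirected graph, every
    node is assumed to have at least one neighbour). seeds is the set of
    start nodes that share the restart probability equally.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Remaining mass is split evenly among the neighbours of each node.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain ncRNA_A - disease_1 - ncRNA_B - disease_2.
toy_graph = {
    "ncRNA_A": ["disease_1"],
    "disease_1": ["ncRNA_A", "ncRNA_B"],
    "ncRNA_B": ["disease_1", "disease_2"],
    "disease_2": ["ncRNA_B"],
}
scores = random_walk_with_restart(toy_graph, seeds={"ncRNA_A"})
ranked_diseases = sorted(
    (n for n in scores if n.startswith("disease")), key=lambda n: -scores[n]
)
print(ranked_diseases)
```

    In this toy chain the disease directly linked to the seed ncRNA ranks above the more distant one, which is the behaviour a prioritization score is expected to show.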

  14. COMAN: a web server for comprehensive metatranscriptomics analysis.

    Science.gov (United States)

    Ni, Yueqiong; Li, Jun; Panagiotou, Gianni

    2016-08-11

    Microbiota-oriented studies based on metagenomic or metatranscriptomic sequencing have revolutionised our understanding of microbial ecology and the roles of both clinical and environmental microbes. The analysis of massive metatranscriptomic data requires extensive computational resources, a collection of bioinformatics tools and expertise in programming. We developed COMAN (Comprehensive Metatranscriptomics Analysis), a web-based tool dedicated to automatically and comprehensively analysing metatranscriptomic data. The COMAN pipeline includes quality control of raw reads, removal of reads derived from non-coding RNA, followed by functional annotation, comparative statistical analysis, pathway enrichment analysis, co-expression network analysis and high-quality visualisation. The essential data generated by COMAN are also provided in tabular format for additional analysis and integration with other software. The web server has an easy-to-use interface and detailed instructions, and is freely available at http://sbb.hku.hk/COMAN/. CONCLUSIONS: COMAN is an integrated web server dedicated to comprehensive functional analysis of metatranscriptomic data, translating massive amounts of reads into data tables and high-standard figures. It is expected to help researchers with less expertise in bioinformatics to answer microbiota-related biological questions and to increase the accessibility and interpretation of microbiota RNA-Seq data.

  15. Analysis of requirements for teaching materials based on the course bioinformatics for plant metabolism

    Science.gov (United States)

    Balqis, Widodo, Lukiati, Betty; Amin, Mohamad

    2017-05-01

    One way to improve the quality of learning in the Plant Metabolism course in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the need for bioinformatics-based teaching material in the Plant Metabolism course using the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to the students in the Plant Metabolism course of the Department of Biology, University of Malang, and through analysis of the semester lecture plan (RPS). Learning gains of this course show that it is not yet integrated with the field of bioinformatics. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of chemical compounds of local plants in Indonesia. Respondents thought that bioinformatics can explain examples and the metabolism of secondary metabolites, analysis techniques, and potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures on plant metabolism. Therefore, the development of teaching materials that include bioinformatics-based plant metabolism is important to improve the understanding of the lecture material in plant metabolism.

  16. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    -friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface...... such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical...... displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics...

  17. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  18. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more recently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdVs), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.
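    The zone-wise comparison described in this record can be illustrated with a short sketch. It assumes two already aligned sequences of equal length and 1-based inclusive zone coordinates; the toy sequences and zone boundaries below are invented for illustration and are not the HAdV-C2/HAdV-C5 hexon data, and the function name percent_identity is hypothetical. Alignment columns containing a gap are simply counted as mismatches.

```python
def percent_identity(seq_a, seq_b, start, end):
    """Percent of identical columns between positions start..end (1-based, inclusive)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    window_a = seq_a[start - 1:end]
    window_b = seq_b[start - 1:end]
    # A column matches only if both residues are equal and neither is a gap.
    matches = sum(1 for x, y in zip(window_a, window_b) if x == y and x != "-")
    return 100.0 * matches / len(window_a)


# Hypothetical 12-nt alignment split into three toy zones.
aligned_a = "ATGCATGCATGC"
aligned_b = "ATGCTTGAATGC"
zones = {"zone 1": (1, 4), "zone 2": (5, 8), "zone 3": (9, 12)}

identity_by_zone = {
    name: percent_identity(aligned_a, aligned_b, start, end)
    for name, (start, end) in zones.items()
}
print(identity_by_zone)
```

    With real data the zone boundaries reported in the abstract (1-393, 394-1410, 1411-2910) would replace the toy coordinates, provided they refer to positions in the alignment.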

  19. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background: Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results: We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). Conclusion: From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous
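    The idea of defining workflow operators once and reusing them as generic components can be sketched in a few lines. This is not GPIPE or PISE code: the operator names sequence, conditional and iterate, and the toy read-cleaning steps, are hypothetical and chosen only to show how sequencing, conditional statements and iteration can be parameterized rather than hard-coded.

```python
def sequence(*steps):
    """Run the given steps one after another, feeding each output to the next."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


def conditional(predicate, if_true, if_false):
    """Choose one of two steps depending on a test applied to the data."""
    def run(data):
        return if_true(data) if predicate(data) else if_false(data)
    return run


def iterate(step, until, max_rounds=100):
    """Repeat a step until the stop condition holds (bounded by max_rounds)."""
    def run(data):
        for _ in range(max_rounds):
            if until(data):
                break
            data = step(data)
        return data
    return run


# Toy workflow: upper-case a read, trim trailing N bases, discard short reads.
clean_read = sequence(
    str.upper,
    iterate(lambda s: s[:-1], until=lambda s: not s.endswith("N")),
    conditional(lambda s: len(s) >= 4, lambda s: s, lambda s: ""),
)

print(clean_read("acgtnn"))
print(repr(clean_read("acn")))
```

    Because each operator returns an ordinary callable, workflows built this way can themselves be nested as steps inside larger workflows.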

  20. Bioinformatic analysis to discover putative drug targets against ...

    African Journals Online (AJOL)

    2012-01-26

    Jan 26, 2012 ... JVIRTUAL GEL. GELBANK was available from the NCBI FTP server. This website incorporates only completed genomes and information pertinent to 2-DE. Link is available at www.gelbank.anl.gov. JVirGel is a software for the simulation and analysis of proteomics data (http://www.jvirgel.de/). The Java TM.

  1. A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

    Science.gov (United States)

    Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

    2015-07-01

    In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
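    The weighted number of links (WNL) used in this record to rank genes can be sketched as follows. The sketch assumes that WNL is the sum of the interaction scores of all links touching a gene, which the abstract does not spell out; the integer scores, the placeholder gene GENE_X and the function name are hypothetical. Genes with no links are reported as orphans, and the clustering into classes is left out.

```python
def weighted_number_of_links(edges, genes):
    """Sum the scores of every link that touches each gene (assumed WNL definition)."""
    wnl = {gene: 0 for gene in genes}
    for gene_a, gene_b, score in edges:
        wnl[gene_a] = wnl.get(gene_a, 0) + score
        wnl[gene_b] = wnl.get(gene_b, 0) + score
    return wnl


# Hypothetical interaction scores on a 0-1000 scale; not taken from STRING.
genes = ["IL6", "TNF", "IFNG", "ICAM1", "GENE_X"]
edges = [
    ("IL6", "TNF", 900),
    ("IL6", "IFNG", 800),
    ("TNF", "IFNG", 700),
    ("TNF", "ICAM1", 600),
]

wnl = weighted_number_of_links(edges, genes)
# Highest WNL first; ties are broken alphabetically for a stable order.
ranked = sorted(wnl, key=lambda gene: (-wnl[gene], gene))
orphans = [gene for gene in genes if wnl[gene] == 0]

print(ranked)
print(orphans)
```

    In the study the ranked WNL values are then grouped into classes (leader, B, C and so on) by a clustering procedure; any standard one-dimensional clustering could be applied to the wnl values computed here.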

  2. Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.

    Science.gov (United States)

    Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W

    2018-01-01

    Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential. This motivated the present study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. Contact: dtww@ucla.edu. © The Author (2017). Published by Oxford University Press.

  3. Bioinformatics and biomarker discovery "Omic" data analysis for personalized medicine

    CERN Document Server

    Azuaje, Francisco

    2010-01-01

    This book is designed to introduce biologists, clinicians and computational researchers to fundamental data analysis principles, techniques and tools for supporting the discovery of biomarkers and the implementation of diagnostic/prognostic systems. The focus of the book is on how fundamental statistical and data mining approaches can support biomarker discovery and evaluation, emphasising applications based on different types of "omic" data. The book also discusses design factors, requirements and techniques for disease screening, diagnostic and prognostic applications. Readers are provided w

  4. Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

    Science.gov (United States)

    Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

    2012-10-01

    In spite of the medical advances in recent years, the world is in need of different sources to address certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To make information about RIPs easily accessible, RIPs need to be analysed with a view to constructing a database on RIPs. Also, multiple sequence alignment was done to screen for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.

  5. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    Science.gov (United States)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  6. Bioinformatics Analysis Reveals Genes Involved in the Pathogenesis of Ameloblastoma and Keratocystic Odontogenic Tumor.

    Science.gov (United States)

    Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição

    2016-01-01

    Pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. The gene list was expanded, and the gene interactions network was mapped using the STRING software. "Weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by K-means method and Kruskal-Wallis test was used (Preview data was used to corroborate the bioinformatics data. CDK1 was identified as leader gene for AM. In the KCOT group, results show PCNA and TP53. Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyses were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.
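
    The "weighted number of links" idea can be sketched as follows, assuming (this is an assumption, since the abstract does not define it) that a gene's WNL is the sum of the interaction scores on all edges touching it, with the leader gene being the one with the highest sum. Gene names G1 to G4 and the integer scores are hypothetical placeholders, not STRING output.

```python
from collections import defaultdict


def weighted_number_of_links(edges):
    """Sum the interaction scores of all edges incident to each gene."""
    wnl = defaultdict(int)
    for gene_a, gene_b, score in edges:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return dict(wnl)


# Hypothetical network: (gene, gene, combined score on a 0-1000 scale).
toy_edges = [
    ("G1", "G2", 900),
    ("G1", "G3", 800),
    ("G2", "G3", 700),
    ("G1", "G4", 400),
]

wnl = weighted_number_of_links(toy_edges)
leader = max(wnl, key=wnl.get)  # gene with the highest WNL
print(wnl, leader)
```

    In the study the ranked WNL values are then grouped by K-means; that clustering step is not reproduced here.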

  7. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  8. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  9. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  10. Expression and Bioinformatic Analysis of Ornithine Aminotransferase in Non-small Cell Lung Cancer

    Directory of Open Access Journals (Sweden)

    Danfei ZHOU

    2012-09-01

    Full Text Available Background and objective: It has been proven that ornithine aminotransferase (OAT) might play an important role in the oncogenesis and progression of numerous malignant tumors. The aim of this study is to detect the mRNA and protein expression of OAT in non-small cell lung cancer (NSCLC), as well as to analyze the bioinformatic features and binary interactions. Methods: OAT mRNA expression was detected in A549 and 16HBE cell lines by reverse transcription-polymerase chain reaction. OAT protein expression was determined in 55 cases of NSCLC and 17 cases of adjacent non-tumor lung tissues by immunohistochemical staining. The bioinformatic features and binary interactions of OAT were analyzed. Gene ontology annotation and signal pathway analysis were performed. Results: OAT mRNA expression in A549 cells was 2.85-fold lower than that in 16HBE cells. OAT protein expression was significantly higher in NSCLC tissues than that in adjacent non-tumor lung tissues. A significant difference in OAT protein expression existed between squamous cell lung cancer and adenocarcinoma (P<0.05), but OAT protein expression was not correlated with gender, age, lymph node metastasis, tumor size, or TNM stage. Bioinformatic analysis suggested that OAT was a highly homologous and stable protein located in the mitochondria. An aminotran-3 domain and several sites of phosphorylation, which may function in signal transduction, gene transcription, and molecular transit, were found. In the 54 selected binary interactions of OAT, TNF and TRAF6 play roles in the NF-κB pathway. Conclusion: OAT may play an important role in the oncogenesis and progression of NSCLC. Thus, OAT may be a novel biomarker for the diagnosis of NSCLC or a new target for its treatment.

  11. Identification of differentially expressed genes and signaling pathways in ovarian cancer by integrated bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Yang X

    2018-03-01

    Full Text Available Xiao Yang,1 Shaoming Zhu,2 Li Li,3 Li Zhang,1 Shu Xian,1 Yanqing Wang,1 Yanxiang Cheng1 1Department of Obstetrics and Gynecology, 2Department of Urology, Renmin Hospital of Wuhan University, 3Department of Pharmacology, Wuhan University Health Science Center, Wuhan, Hubei, People’s Republic of China Background: The mortality rate associated with ovarian cancer ranks the highest among gynecological malignancies. However, the cause and underlying molecular events of ovarian cancer are not clear. Here, we applied integrated bioinformatics to identify key pathogenic genes involved in ovarian cancer and reveal potential molecular mechanisms. Results: The expression profiles of GDS3592, GSE54388, and GSE66957 were downloaded from the Gene Expression Omnibus (GEO) database, which contained 115 samples, including 85 cases of ovarian cancer samples and 30 cases of normal ovarian samples. The three microarray datasets were integrated to obtain differentially expressed genes (DEGs) and were deeply analyzed by bioinformatics methods. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichments of DEGs were performed by DAVID and KOBAS online analyses, respectively. The protein–protein interaction (PPI) networks of the DEGs were constructed from the STRING database. A total of 190 DEGs were identified in the three GEO datasets, of which 99 genes were upregulated and 91 genes were downregulated. GO analysis showed that the biological functions of DEGs focused primarily on regulating cell proliferation, adhesion, and differentiation and intracellular signal cascades. The main cellular components include cell membranes, exosomes, the cytoskeleton, and the extracellular matrix. The molecular functions include growth factor activity, protein kinase regulation, DNA binding, and oxygen transport activity. KEGG pathway analysis showed that these DEGs were mainly involved in the Wnt signaling pathway, amino acid metabolism, and the

  12. MEDICAL DIAGNOSTICS BY MICROSTRUCTURAL ANALYSIS OF BIOLOGICAL LIQUID DRIED PATTERNS AS A PROBLEM OF BIOINFORMATICS

    Directory of Open Access Journals (Sweden)

    Petr Vladimirovich Lebedev-Stepanov, Dr.

    2018-02-01

    Full Text Available Motivation: It is important to develop high-precision computerized methods for rapid medical diagnostics that generalize the unique clinical experience obtained in the past decade into specialized solutions for diagnosing and monitoring specific diseases and, potentially, for broad health monitoring of the virtually healthy population, in order to identify the reserves of human health and take action to prevent depletion of these reserves. In this work we present one of the new directions in bioinformatics, i.e. medical diagnostics by an automated expert system based on morphology analysis of digital images of dried patterns of biological liquids. Results: The proposed method is a combination of bioinformatics and biochemistry approaches for obtaining diagnostic information from a morphological analysis of standardized dried patterns of sessile drops of biological liquid. We carried out our own research in collaboration with medical diagnostic centers and built an electronic database for recognition of the following types of diseases: candidiasis; neoplasms; diabetes mellitus; diseases of the circulatory system; cerebrovascular disease; diseases of the digestive system; diseases of the genitourinary system; infectious diseases; factors relevant to the work; factors associated with environmental pollution; factors related to lifestyle. A laboratory setup for diagnostics of the human body in pathological states has been developed. The diagnostic results are considered. Availability: Access to testing the software can be obtained on request to the contact email below.

  13. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins and intergenic regions, including entire genomes of some species. The analysis of this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units built to withstand the heavy workload of sequence analysis. However, innovation is still needed on platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.

  14. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.
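
    The abundance and persistence filtering described above can be sketched outside Galaxy as follows. The definitions used here are assumptions made for illustration: abundance is taken as the total read count of a sequence across all rounds, and persistence as the number of selection rounds in which it appears at least once. The reads, the function names and the threshold are hypothetical and do not reproduce the published Workflows.

```python
from collections import Counter


def summarize_rounds(rounds):
    """Return (abundance, persistence) counters for reads grouped by selection round."""
    abundance = Counter()    # total reads per unique sequence
    persistence = Counter()  # number of rounds containing the sequence
    for reads in rounds:
        counts = Counter(reads)
        abundance.update(counts)           # add per-round read counts
        persistence.update(counts.keys())  # add 1 for each sequence seen this round
    return abundance, persistence


def filter_by_persistence(persistence, min_rounds):
    """Keep sequences present in at least `min_rounds` rounds, sorted for stable output."""
    return sorted(seq for seq, n in persistence.items() if n >= min_rounds)


# Hypothetical reads from three selection rounds.
toy_rounds = [
    ["AAGC", "AAGC", "TTCG"],
    ["AAGC", "GGTA"],
    ["AAGC", "TTCG", "TTCG"],
]

abundance, persistence = summarize_rounds(toy_rounds)
print(abundance, persistence, filter_by_persistence(persistence, 2))
```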

  15. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. Bioinformatics analysis of the phytoene synthase gene in cabbage (Brassica oleracea var. capitata)

    Science.gov (United States)

    Sun, Bo; Jiang, Min; Xue, Shengling; Zheng, Aihong; Zhang, Fen; Tang, Haoru

    2018-04-01

    Phytoene Synthase (PSY) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata PSY (BocPSY) gene sequences were obtained from the Brassica database (BRAD) and subjected to bioinformatics analysis. The BocPSY1, BocPSY2 and BocPSY3 genes mapped to chromosomes 2, 3 and 9, and contain open reading frames of 1,248 bp, 1,266 bp and 1,275 bp that encode proteins of 415, 421 and 424 amino acids, respectively. Subcellular localization prediction placed all BocPSY proteins in the chloroplast. The conserved domain of the BocPSY protein is PLN02632. Homology analysis indicates that the levels of identity among BocPSYs were all more than 85%, and the PSY protein is apparently conserved during plant evolution. The findings of the present study provide a molecular basis for the elucidation of PSY gene function in cabbage.
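
    The relation between the reported open reading frame lengths and protein lengths can be checked with a short calculation, assuming each ORF length includes one terminal stop codon that is not translated (the usual convention, though the abstract does not state it). The function name `protein_length` is a hypothetical helper.

```python
def protein_length(orf_bp):
    """Number of amino acids encoded by an ORF whose length includes one stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # total codons minus the stop codon


# ORF lengths as reported in the abstract.
reported_orfs = {"BocPSY1": 1248, "BocPSY2": 1266, "BocPSY3": 1275}
for gene, orf_bp in reported_orfs.items():
    print(gene, orf_bp, protein_length(orf_bp))
```

    Under that assumption the three ORFs give 415, 421 and 424 residues, which agrees with the protein lengths stated in the abstract.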

  17. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a big concern, as the amount of data generated at several locations is growing exponentially. Therefore, there is a need to develop techniques to store data using compression algorithms. Here we describe an optimal storage algorithm (OPTSDNA) for storing large amounts of DNA sequences of varying length. This paper provides a performance analysis of OPTSDNA in a distributed bioinformatics computing system for analysis of DNA sequences. The OPTSDNA algorithm is used for storing DNA sequences of various sizes, from very small to very large, in a database. Storage size and response time are both calculated in this work. The efficiency and performance of the algorithm are high (in size calculation with percentage) when compared with other known sequential approaches.
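
    The abstract does not describe how OPTSDNA encodes sequences, so the sketch below is not that algorithm. It shows a generic, well-known baseline for compact DNA storage, 2-bit packing of the four bases, which reduces an A/C/G/T-only sequence to roughly a quarter of its one-byte-per-base size. The names `pack` and `unpack` are hypothetical, and ambiguous bases such as N are deliberately not handled.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte; also return the base count."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, n_bases):
    """Inverse of pack: recover the first n_bases bases from the packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n_bases])


toy_seq = "ACGTACGTAC"  # hypothetical 10-base sequence
packed, n = pack(toy_seq)
print(len(toy_seq), len(packed), unpack(packed, n))
```

    The base count is stored alongside the bytes because the final byte may contain padding bits that would otherwise decode as extra A bases.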

  18. Adaptation of a Bioinformatics Microarray Analysis Workflow for a Toxicogenomic Study in Rainbow Trout.

    Directory of Open Access Journals (Sweden)

    Sophie Depiereux

    Full Text Available Sex steroids play a key role in triggering sex differentiation in fish, with exogenous hormone treatment leading to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development) and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to find out mechanisms of action (MOA) and identify biomarkers related to the toxic action of a compound. However, high throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish, by the analysis of molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L) of ethynylestradiol (EE2), and to offer target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow elaborated on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow allowed us to obtain lists of genes supposed to be enriched in true positive differentially expressed genes (DEGs), which were subjected to over-representation analysis (ORA) methods. Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group displayed specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and

  19. Identification of the intestinal type gastric adenocarcinoma transcriptomic markers using bioinformatic and gene expression analysis

    Directory of Open Access Journals (Sweden)

    V. V. Volkomorov

    2017-01-01

    Full Text Available Introduction. Searching for specific and sensitive molecular tumor markers is one of the important tasks of modern oncology. These markers can be used for early tumor diagnosis and prognosis as well as for prediction of therapeutic response, estimation of tumor volume or to assess disease recurrence through monitoring. Gene expression database mining followed by experimental validation of the results obtained is one of the promising approaches for a search of this kind. Objective: to identify several membrane proteins which can be used for serum diagnosis of intestinal type of gastric adenocarcinoma. Materials and methods. We used a bioinformatic-driven search using Gene Ontology and The Cancer Genome Atlas (TCGA) data to identify mRNAs up-regulated in gastric cancer (GC). Then, the expression levels of the mRNAs in 55 paired clinical specimens were investigated using reverse transcription polymerase chain reaction. Results. Comparative analysis of the mRNA levels in normal and tumor tissues using a new bioinformatics algorithm allowed identification of 3 high-copy transcripts (SULF1, PMEPA1 and SPARC), the intracellular content of which markedly increased in GC. Expression analysis of these genes in clinical specimens showed significantly higher mRNA levels of PMEPA1 and SPARC in tumor as compared to normal gastric tissue. Interestingly, a more than twofold increase in the expression level of these genes was observed in 75 % of intestinal-type GC. The same results were found only in 25 and 38 % of diffuse-type GC respectively. Conclusions. As a result of an original bioinformatic analysis using the TCGA database, two genes (PMEPA1 and SPARC) were shown to be significantly upregulated in intestinal-type gastric adenocarcinoma. The findings show the importance of further investigation to clarify the clinical value of their expression level in stomach tumors as well as their role in carcinogenesis.

  20. Identification of new serum markers of pathological states by bioinformatic tools for the analysis of serum proteomics expression profiles

    International Nuclear Information System (INIS)

    Malorni, A.; Facchiano, A.

    2009-01-01

    We have developed new bioinformatic tools and strategies, aimed at the identification and characterization of proteins as markers of pathological states, for the analysis of data derived from protein expression profiles obtained by mass spectrometry techniques, for the study of structural and functional properties of the proteins, and for the analysis of data from omics approaches.

  1. Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.

    Science.gov (United States)

    Huang, Xin; Li, Hao-ming

    2009-08-05

    Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. According to the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed using internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA, and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process lovE is a regulatory gene and the LovE protein is a GAL4-like transcriptional factor.

  2. Bioinformatics analysis of breast cancer bone metastasis related gene-CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze the breast cancer bone metastasis related gene CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-throughput gene chip. Results: It was found that the expressions of 396 genes were different, including 165 up-regulations and 231 down-regulations. The expression of chemokine receptor CXCR4 was obviously up-regulated in the tissue with breast cancer bone metastasis. Compared with the tissue without bone metastasis, there was a significant difference, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  3. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    Science.gov (United States)

    Ju, Feng; Zhang, Tong

    2015-11-03

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  4. Systematic Identification and Bioinformatic Analysis of MicroRNAs in Response to Infections of Coxsackievirus A16 and Enterovirus 71.

    Science.gov (United States)

    Zhu, Zheng; Qi, Yuhua; Fan, Huan; Cui, Lunbiao; Shi, Zhiyang

    2016-01-01

    Hand, foot, and mouth disease (HFMD), mainly caused by coxsackievirus A16 (CVA16) and enterovirus 71 (EV71) infections, remains a serious public health issue with thousands of newly diagnosed cases each year since 2008 in China. The mechanisms underlying viral infection, however, are elusive to date. In the present study, we systematically investigated the host cellular microRNA (miRNA) expression patterns in response to CVA16 and EV71 infections. Through microarray examination, 27 miRNAs (15 upregulated and 12 downregulated) were found to be coassociated with the replication process of the two viruses, while the expression levels of 15 and 5 miRNAs were significantly changed in CVA16- and EV71-infected cells, respectively. A great number of target genes of the 27 common differentially expressed miRNAs were predicted by combined use of two computational target prediction algorithms, TargetScan and MiRanda. Comprehensive bioinformatic analysis of target genes in GO categories and KEGG pathways indicated the involvement of diverse biological functions and signaling pathways during viral infection. These results provide an overview of the roles of miRNAs in virus-host interaction, which will contribute to further understanding of HFMD pathological mechanisms.
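
    The abstract does not say which statistic was used for the GO and KEGG analysis. A common choice for this kind of over-representation question is the one-sided hypergeometric test, sketched below with the standard library only. All counts in the example are hypothetical and the function name `hypergeom_pvalue` is an illustrative helper, not part of any tool named in the abstract.

```python
from math import comb


def hypergeom_pvalue(population, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` belong to the category of interest."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 10 background genes, 5 in the pathway,
# 5 predicted targets, all 5 of which fall in the pathway.
p = hypergeom_pvalue(population=10, annotated=5, drawn=5, overlap=5)
print(p)
```

    In practice the resulting p-values are corrected for multiple testing across all categories examined, a step omitted here.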

  5. Bioinformatic analysis of Msx1 and Msx2 involved in craniofacial development.

    Science.gov (United States)

    Dai, Jiewen; Mou, Zhifang; Shen, Shunyao; Dong, Yuefu; Yang, Tong; Shen, Steve Guofang

    2014-01-01

    Msx1 and Msx2 were revealed to be candidate genes for some craniofacial deformities, such as cleft lip with/without cleft palate (CL/P) and craniosynostosis. Many other genes were demonstrated to have a cross-talk with MSX genes in causing these defects. However, there is no systematic evaluation of these MSX gene-related factors. In this study, we performed a systematic bioinformatic analysis of MSX genes by combining the GeneDecks, DAVID, and STRING databases, and the results showed that there were numerous genes related to MSX genes, such as Irf6, TP63, Dlx2, Dlx5, Pax3, Pax9, Bmp4, Tgf-beta2, and Tgf-beta3, which have been demonstrated to be involved in CL/P, and Fgfr2, Fgfr1, Fgfr3, and Twist1, which were involved in craniosynostosis. Many of these genes could be enriched into different gene groups involved in different signaling pathways, different craniofacial deformities, and different biological processes. These findings could allow us to analyze the function of MSX genes in a gene network. In addition, our findings showed that Sumo, a novel gene whose polymorphisms were demonstrated to be associated with nonsyndromic CL/P by genome-wide association study, has a protein-protein interaction with MSX1, which may offer us an alternative method to perform bioinformatic analysis of genes found by genome-wide association study and can allow us to predict the disrupted protein function due to a mutation in a gene's DNA sequence. These findings may guide us to perform further functional studies in the future.

  6. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, forcing researchers to access multiple applications and perform manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links for support information including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Contact: Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  7. Bioinformatics analysis identify novel OB fold protein coding genes in C. elegans.

    Directory of Open Access Journals (Sweden)

    Daryanaz Dargahi

    Full Text Available BACKGROUND: The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the identification of novel genes in silico in this model organism is becoming more challenging, requiring new approaches. The oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family, in which protein sequences, in spite of having the same fold, share very little sequence identity (5-25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46) compared to other evolutionarily related eukaryotes, such as yeast S. cerevisiae (n=344) or fruit fly D. melanogaster (n=84). Gene loss during evolution, or differences in the level of annotation for this protein family, may explain these discrepancies. METHODOLOGY/PRINCIPAL FINDINGS: This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods followed by 3D-structure prediction as a filtering step to eliminate false positive candidate sequences. We have predicted 18 coding genes containing the OB-fold that, remarkably, have been only partially characterized in C. elegans. CONCLUSIONS/SIGNIFICANCE: This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large-scale analysis by the WormBase consortium when new versions of the genome sequence of C. elegans, or of other evolutionarily related species, are released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

  8. In Silico Identification, Phylogenetic and Bioinformatic Analysis of Argonaute Genes in Plants

    Directory of Open Access Journals (Sweden)

    Khaled Mirzaei

    2014-01-01

    Full Text Available The Argonaute protein family is a key player in pathways of gene silencing and small regulatory RNAs in different organisms. Argonaute proteins can bind small noncoding RNAs and control protein synthesis, affect messenger RNA stability, and even participate in the production of new forms of small RNAs. The aim of this study was to characterize and perform bioinformatic analysis of Argonaute proteins in 32 plant species whose genomes have been sequenced. A total of 437 Argonaute genes were identified and analyzed based on length, gene structure, and protein structure. Results showed that Argonaute proteins are highly conserved across the plant kingdom. Phylogenetic analysis divided plant Argonautes into three classes. In addition to the three conserved domains, namely PAZ, MID, and PIWI, we identified a few more domains in the AGO proteins of some plant species. Expression profile analysis showed that expression of these genes varies across most tissues, which suggests that these proteins are involved in the regulation of most pathways of the plant system. The number of alternative transcripts of Argonaute genes was highly variable among the plants. A thorough analysis of a large number of putative Argonaute genes revealed several interesting aspects of this protein family and provided novel information with promising usefulness for both basic and biotechnological applications.

  9. [BIOINFORMATIC SEARCH AND PHYLOGENETIC ANALYSIS OF THE CELLULOSE SYNTHASE GENES OF FLAX (LINUM USITATISSIMUM)].

    Science.gov (United States)

    Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B

    2015-01-01

    A bioinformatic search for sequences encoding cellulose synthase genes in the flax genome, and their comparison to dicot orthologs, was carried out. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, while the remaining 16 encode cellulose synthase-like proteins (Csl). Phylogenetic analysis of the products of the cellulose synthase genes allowed 6 groups of cellulose synthase genes of different classes to be distinguished: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9, which are associated with primary cell wall formation, are characterized by greater similarity within these classes than orthologous sequences, whereas the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of the 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in the flax genome, which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes have been identified in the flax genome, but their correct classification requires additional research.

  10. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    Science.gov (United States)

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species whose genomes are publicly available was analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the amino acid sequences of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  11. Analysis of microRNA profile of Anopheles sinensis by deep sequencing and bioinformatic approaches.

    Science.gov (United States)

    Feng, Xinyu; Zhou, Xiaojian; Zhou, Shuisen; Wang, Jingwen; Hu, Wei

    2018-03-12

    microRNAs (miRNAs) are small non-coding RNAs widely identified in many mosquitoes. They are reported to play important roles in development, differentiation and innate immunity. However, miRNAs in Anopheles sinensis, one of the Chinese malaria mosquitoes, remain largely unknown. We investigated the global miRNA expression profile of An. sinensis using Illumina HiSeq 2000 sequencing. Meanwhile, we applied a bioinformatic approach to identify potential miRNAs in An. sinensis. The miRNA profiles identified by the two approaches were compared and analyzed. The selected miRNAs from the sequencing result and the bioinformatic approach were confirmed with qRT-PCR. Moreover, target prediction, GO annotation and pathway analysis were carried out to understand the role of miRNAs in An. sinensis. We identified 49 conserved miRNAs and 12 novel miRNAs by next-generation high-throughput sequencing technology. In contrast, 43 miRNAs were predicted by the bioinformatic approach, of which two were assigned as novel. Comparative analysis of the miRNA profiles from the two approaches showed that 21 miRNAs were shared between them. Twelve novel miRNAs did not match any known miRNAs of any organism, indicating that they are possibly species-specific. Forty miRNAs were found in many mosquito species, indicating that these miRNAs are evolutionarily conserved and may have critical roles in the process of life. Both the selected known and novel miRNAs (asi-miR-281, asi-miR-184, asi-miR-14, asi-miR-nov5, asi-miR-nov4, asi-miR-9383, and asi-miR-2a) could be detected by quantitative real-time PCR (qRT-PCR) in the sequenced sample, and the expression patterns of these miRNAs measured by qRT-PCR were in concordance with the original miRNA sequencing data. The predicted targets for the known and the novel miRNAs covered many important biological roles and pathways, indicating the diversity of miRNA functions. We also found 21 conserved miRNAs and eight counterparts of target immune pathway genes in An. sinensis

  12. [Comparison of gut microbiotal compositional analysis of patients with irritable bowel syndrome through different bioinformatics pipelines].

    Science.gov (United States)

    Zhu, S W; Liu, Z J; Li, M; Zhu, H Q; Duan, L P

    2018-04-18

    To assess whether the same biological conclusions regarding the microbial composition of irritable bowel syndrome (IBS) patients, and the same diagnostic or curative effects, could be reached through different bioinformatics pipelines, we used two common bioinformatics pipelines (Uparse V2.0 and Mothur V1.39.5) to analyze the same fecal microbial 16S rRNA high-throughput sequencing data. The two pipelines were used to analyze the diversity and richness of fecal microbial 16S rRNA high-throughput sequencing data of 27 samples, including 9 healthy controls (HC group) and 9 diarrhea IBS patients before (IBS group) and after Rifaximin treatment (IBS-treatment, IBSt group). Analyses such as microbial diversity, principal co-ordinates analysis (PCoA), nonmetric multidimensional scaling (NMDS) and linear discriminant analysis effect size (LEfSe) were used to identify the microbial differences between the HC and IBS groups and between the IBS and IBSt groups. (1) Microbial composition comparison of the 27 samples in the two pipelines showed significant variations at both family and genus levels, while no significant variations were seen at phylum level; (2) There was no significant difference in the comparison of HC vs. IBS or IBS vs. IBSt (Uparse: HC vs. IBS, F=0.98, P=0.445; IBS vs. IBSt, F=0.47, P=0.926; Mothur: HC vs. IBS, F=0.82, P=0.646; IBS vs. IBSt, F=0.37, P=0.961). The Shannon index was significantly decreased in IBSt; (3) Both pipelines distinguished the significantly enriched genera between HC and IBS groups. For example, Nitrosomonas and Paraprevotella increased while Pseudoalteromonadaceae and Anaerotruncus decreased in the HC group through the Uparse pipeline, whereas Roseburia 62 increased while Butyricicoccus and Moraxellaceae decreased in the HC group through the Mothur pipeline. Only the Uparse pipeline could pick out significant genera between IBS and IBSt, such as Pseudobutyrivibrio, Clostridiaceae 1 and Clostridium sensu stricto 1. There were taxonomic and phylogenetic diversity differences between the two
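
    The abstract above reports the Shannon index as one of its diversity measures. As a minimal sketch (not the Uparse or Mothur implementation), the index can be computed from a vector of per-taxon read counts as H = -sum(p_i * ln p_i). The function name and the count vectors below are hypothetical and are not data from the study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one observation")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical per-taxon read counts for two samples (illustrative only).
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # one dominant taxon

print(shannon_index(even_sample))    # equals ln(4) for an even community
print(shannon_index(skewed_sample))  # lower, because one taxon dominates
```

    A drop in H between two groups therefore reflects fewer taxa, a less even distribution, or both; the natural logarithm is assumed here, and other bases only rescale the value.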

  13. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data

    Directory of Open Access Journals (Sweden)

    Chen C

    2016-03-01

    Full Text Available Chen Chen,1 Li-Guo Zhang,1 Jian Liu,1 Hui Han,1 Ning Chen,1 An-Liang Yao,1 Shao-San Kang,1 Wei-Xing Gao,1 Hong Shen,2 Long-Jun Zhang,1 Ya-Peng Li,1 Feng-Hong Cao,1 Zhi-Guo Li3 1Department of Urology, North China University of Science and Technology Affiliated Hospital, 2Department of Modern Technology and Education Center, 3Department of Medical Research Center, International Science and Technology Cooperation Base of Geriatric Medicine, North China University of Science and Technology, Tangshan, People’s Republic of China Abstract: We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa) through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs) into two groups: the group consisting of PCa and benign tissues (P&b) and the group presenting both high and low PCa metastatic tendencies (H&L). In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM)–receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore

  14. Molecular and bioinformatic analysis of the FB-NOF transposable element.

    Science.gov (United States)

    Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

    2006-04-12

    The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of the FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing of the 2 kb long FB inverted repeats of a complete FB-NOF element, with high precision and reliability. This achievement was made possible by a new map of the FB repetitive region, which unambiguously identifies each repeat using new features that can serve as landmarks. With this new view of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous works that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T-rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.

  15. Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.

    Science.gov (United States)

    Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J

    2013-01-01

    Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase) most of the highly impacted pathways (e.g., ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences, with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promote proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.

  16. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide, and TCM-based new drug development, especially for the treatment of complex diseases, is promising. However, owing to TCM’s diverse ingredients and their complex interaction with the human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and, combined with subsequent experimental validation, revealed for the first time the role of the renin-angiotensin system in QSYQ’s cardioprotective effects. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  17. Bioinformatics analysis of the ζ-carotene desaturase gene in cabbage (Brassica oleracea var. capitata)

    Science.gov (United States)

    Sun, Bo; Zheng, Aihong; Jiang, Min; Xue, Shengling; Zhang, Fen; Tang, Haoru

    2018-04-01

    ζ-carotene desaturase (ZDS) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata ZDS (BocZDS) gene sequences were obtained from the Brassica database (BRAD) and subjected to bioinformatics analysis. The BocZDS gene mapped to Scaffold000363, and contains an open reading frame of 1,686 bp that encodes a 561-amino acid protein with a calculated molecular mass of 62.00 kD and an isoelectric point (pI) of 8.2. Subcellular localization analysis predicted that the BocZDS protein is located in the chloroplast. The conserved domain of the BocZDS protein is PLN02487, indicating that it is a member of the zeta-carotene desaturase family. Homology analysis indicates that the ZDS protein is conserved during plant evolution and is most closely related to the ZDS proteins of B. oleracea var. oleracea, B. napus, and B. rapa. The findings of the present study provide a molecular basis for the elucidation of ZDS gene function in cabbage.
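
    The relation between the reported ORF length (1,686 bp) and protein length (561 amino acids) can be checked with a short calculation: an ORF that includes its stop codon encodes one residue fewer than its codon count. The sketch below is illustrative only; the function names are hypothetical, and the 110 Da average residue mass is a common rule of thumb, not the method the authors used to obtain 62.00 kD.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons

def approx_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough mass estimate; 110 Da per residue is an assumed average."""
    return n_residues * avg_residue_da / 1000.0

n_aa = protein_length_from_orf(1686)
print(n_aa)                   # 561, matching the abstract
print(approx_mass_kda(n_aa))  # 61.71, close to the reported 62.00 kD
```

    The small gap between the estimate and the reported mass is expected, since the true value depends on the actual amino acid composition.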

  18. Bioinformatics tools for quantitative and functional metagenome and metatranscriptome data analysis in microbes.

    Science.gov (United States)

    Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin

    2017-05-08

    Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into the activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help in understanding the activity or enrichment of a given gene set, such as drug-resistance genes, among microbiome samples. Here, we summarize existing bioinformatics tools for metagenomic and metatranscriptomic data analysis, with the aim of helping researchers choose the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press.

  19. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  20. Bioinformatic analysis of phage AB3, a phiKMV-like virus infecting Acinetobacter baumannii.

    Science.gov (United States)

    Zhang, J; Liu, X; Li, X-J

    2015-01-16

    The phages of Acinetobacter baumannii have drawn increasing attention because of the multi-drug resistance of A. baumannii. The aim of this study was to sequence Acinetobacter baumannii phage AB3 and conduct bioinformatic analysis to lay a foundation for genome remodeling and phage therapy. We isolated and sequenced A. baumannii phage AB3 and attempted to annotate and analyze its genome. The results showed that the genome is a double-stranded DNA with a total length of 31,185 base pairs (bp) and 97 open reading frames greater than 100 bp. The genome includes 28 predicted genes, of which 24 are homologous to phage AB1. The entire coding sequence is located on the negative strand, representing 90.8% of the total length. The G+C mol% was 39.18%, without areas of high G+C content over 200 bp in length. No GC island, tRNA gene, or repeated sequence was identified. Gene lengths were 120-3099 bp, with an average of 1011 bp. Six genes were found to be greater than 2000 bp in length. Genomic alignment and phylogenetic analysis of the RNA polymerase gene showed that, similar to phage AB1, phage AB3 is a phiKMV-like virus in the T7 phage family.
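
    The G+C percentage and the search for G+C-rich stretches mentioned above can be illustrated with a minimal sketch. The helper names and the toy sequence are hypothetical (this is not the phage AB3 genome), and the fixed, non-overlapping window scan is an assumed stand-in for whatever tool the authors used.

```python
def gc_percent(seq):
    """G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt

def windowed_gc(seq, window, step):
    """G+C percentage for each window start; used to spot G+C-rich regions."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

toy = "ATGCGCATTA"  # illustrative sequence only
print(gc_percent(toy))                      # 40.0
print(windowed_gc(toy, window=5, step=5))   # [(0, 60.0), (5, 20.0)]
```

    On a real genome one would typically use a 200 bp window, in line with the length threshold quoted in the abstract, and flag windows whose value exceeds a chosen cutoff.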

  1. Aligning experimental design with bioinformatics analysis to meet discovery research objectives.

    Science.gov (United States)

    Kane, Michael D

    2002-01-01

    The utility of genomic technology and bioinformatic analytical support to provide new and needed insight into the molecular basis of disease, development, and diversity continues to grow as more research model systems and populations are investigated. Yet deriving results that meet a specific set of research objectives requires aligning or coordinating the design of the experiment, the laboratory techniques, and the data analysis. The following paragraphs describe several important interdependent factors that need to be considered to generate high quality data from the microarray platform. These factors include aligning oligonucleotide probe design with the sample labeling strategy if oligonucleotide probes are employed, recognizing that compromises are inherent in different sample procurement methods, normalizing 2-color microarray raw data, and distinguishing the difference between gene clustering and sample clustering. These factors do not represent an exhaustive list of technical variables in microarray-based research, but this list highlights those variables that span both experimental execution and data analysis. Copyright 2001 Wiley-Liss, Inc.

  2. Consensus strategy in genes prioritization and combined bioinformatics analysis for preeclampsia pathogenesis.

    Science.gov (United States)

    Tejera, Eduardo; Cruz-Monteagudo, Maykel; Burgos, Germán; Sánchez, María-Eugenia; Sánchez-Rodríguez, Aminael; Pérez-Castillo, Yunierkis; Borges, Fernanda; Cordeiro, Maria Natália Dias Soeiro; Paz-Y-Miño, César; Rebelo, Irene

    2017-08-08

    Preeclampsia is a multifactorial disease with unknown pathogenesis. Although recent studies have explored this disease using several bioinformatics tools, their main objective was not pathogenesis. Additionally, consensus prioritization has proved to be highly efficient in the recognition of gene-disease associations. However, no information is available about the ability of a consensus to provide early recognition of genes directly involved in pathogenesis. Therefore, our aim in this study is to apply several theoretical approaches to explore preeclampsia, specifically those genes directly involved in the pathogenesis. We first evaluated the consensus between 12 prioritization strategies for the early recognition of pathogenic genes related to preeclampsia. A communality analysis in the protein-protein interaction network of the previously selected genes was then performed, followed by enrichment analysis. The enrichment analysis includes metabolic pathways as well as gene ontology. Microarray data were also collected and used to confirm our results or as a strategy to weight the previously enriched pathways. The consensus prioritized gene list was rationally filtered to 476 genes using several criteria. The communality analysis showed an enrichment of communities connected with the VEGF-signaling pathway. This pathway is also enriched when the microarray data are considered. Our results point to VEGF, FLT1 and KDR as relevant pathogenic genes, as well as those connected with NO metabolism. Our results revealed that the consensus strategy improves the detection and initial enrichment of pathogenic genes, at least in preeclampsia. Moreover, the combination of the first percent of the prioritized genes with the protein-protein interaction network followed by communality analysis reduces the gene space. This approach actually identifies well-known genes related to pathogenesis. However, genes like HSP90, PAK2, CD247 and others included in the first 1% of the prioritized list need to be further

  3. Molecular cloning and bioinformatics analysis of lactate dehydrogenase from Taenia multiceps.

    Science.gov (United States)

    Guo, Cheng; Wang, Yu; Huang, Xing; Wang, Ning; Yan, Ming; He, Ran; Gu, Xiaobin; Xie, Yue; Lai, Weimin; Jing, Bo; Peng, Xuerong; Yang, Guangyou

    2017-10-01

    Coenurus cerebralis, the larval stage (metacestode or coenurus) of Taenia multiceps, parasitizes sheep, goats, and other ruminants and causes coenurosis. In this study, we isolated and characterized complementary DNAs that encode lactate dehydrogenase A (Tm-LDHA) and B (Tm-LDHB) from the transcriptome of T. multiceps and expressed recombinant Tm-LDHB (rTm-LDHB) in Escherichia coli. Bioinformatic analysis showed that both Tm-LDH genes (LDHA and LDHB) contain a 996-bp open reading frame and encode a protein of 331 amino acids. After determination of the immunogenicity of the recombinant Tm-LDHB, an indirect enzyme-linked immunosorbent assay (ELISA) was developed for preliminary evaluation of the serodiagnostic potential of rTm-LDHB in goats. However, the rTm-LDHB-based indirect ELISA developed here exhibited specificity of only 71.42% (10/14) and sensitivity of 1:3200 in detection of goats infected with T. multiceps in the field. This study is the first to describe LDHA and LDHB of T. multiceps; meanwhile, our results indicate that rTm-LDHB is not a specific antigen candidate for immunodiagnosis of T. multiceps infection in goats.

  4. Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Ju-Suk Nam

    2014-03-01

    Full Text Available Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers.

  5. Prokaryotic Expression of Rice Ospgip1 Gene and Bioinformatic Analysis of Encoded Product

    Directory of Open Access Journals (Sweden)

    Xi-jun CHEN

    2011-12-01

    Full Text Available Using the reference sequences of pgip genes in GenBank, a fragment of 930 bp covering the open reading frame (ORF) of rice Ospgip1 (Oryza sativa polygalacturonase-inhibiting protein 1) was amplified. The prokaryotic expression product of the gene inhibited the growth of Rhizoctonia solani, the causal agent of rice sheath blight, and reduced its polygalacturonase activity. Bioinformatic analysis showed that OsPGIP1 is a hydrophobic protein with a molecular weight of 32.8 kDa and an isoelectric point (pI) of 7.26. The protein is mainly located in the cell wall of rice, and its signal peptide cleavage site is located between the 17th and 18th amino acids. There are four cysteines in both the N- and C-termini of the deduced protein, which can form three disulfide bonds (between the 56th and 63rd, the 278th and 298th, and the 300th and 308th amino acids). The protein has a typical leucine-rich repeat (LRR) domain, and its secondary structure comprises α-helices, β-sheets and irregular coils. Compared with polygalacturonase-inhibiting proteins (PGIPs) from other plants, the 7th LRR is absent in OsPGIP1. The nine LRRs could form a cleft that might associate with proteins from pathogenic fungi, such as polygalacturonase.

  6. Identification of Missense Mutation (I12T) in the BSND Gene and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Hina Iqbal

    2011-01-01

    Full Text Available Nonsyndromic hearing loss is a paradigm of genetic heterogeneity with 85 loci and 39 nuclear disease genes reported so far. Mutations of BSND have been shown to cause Bartter syndrome type IV, characterized by significant renal abnormalities and deafness, and nonsyndromic hearing loss. We studied a Pakistani consanguineous family. Clinical examinations of affected individuals did not reveal the presence of any associated signs, which are hallmarks of the Bartter syndrome type IV. Linkage analysis identified an area of 18.36 Mb shared by all affected individuals between markers D1S2706 and D1S1596. A maximum two-point LOD score of 2.55 with marker D1S2700 and multipoint LOD score of 3.42 with marker D1S1661 were obtained. BSND mutation, that is, p.I12T, cosegregated in all extant members of our pedigree. BSND mutations can cause nonsyndromic hearing loss, and this is the second report of this mutation. The respective protein, that is, BSND, was first modeled, and then the identified mutation was further analyzed by using different bioinformatics tools; finally, this protein and its mutant were docked with CLCNKB and REN, interactors of BSND, respectively.
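
    The two-point LOD scores quoted above compare the likelihood of linkage at a recombination fraction theta against the likelihood of no linkage (theta = 0.5). A minimal sketch of that ratio follows, assuming phase-known, fully informative meioses; the function name and the counts in the usage lines are hypothetical and are not taken from the study, which would have used dedicated linkage software on the real pedigree.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10( L(theta) / L(0.5) ) for phase-known meioses.

    L(theta) = theta**k * (1 - theta)**(n - k), with k recombinants
    observed among n informative meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    if theta == 0.0 and recombinants > 0:
        # Any observed recombinant rules out complete linkage.
        return float("-inf")

    non_recombinants = meioses - recombinants
    log_linked = 0.0
    if recombinants:
        log_linked += recombinants * math.log10(theta)
    if non_recombinants:
        log_linked += non_recombinants * math.log10(1.0 - theta)
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical counts: 10 informative meioses, no recombinants.
lod_at_zero = two_point_lod(recombinants=0, meioses=10, theta=0.0)
# Scan a grid of theta values and keep the maximum, as a linkage program would.
best_lod = max(two_point_lod(1, 10, t / 100) for t in range(1, 51))
```

    By convention a LOD score of 3 or more is usually taken as significant evidence of linkage, which is why the multipoint score of 3.42 carries more weight than the two-point score of 2.55.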

  7. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and range in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  8. Bioinformatics analysis of RNA-seq data revealed critical genes in colon adenocarcinoma.

    Science.gov (United States)

    Xi, W-D; Liu, Y-J; Sun, X-B; Shan, J; Yi, L; Zhang, T-T

    2017-07-01

    RNA-seq data of colon adenocarcinoma (COAD) were analyzed with bioinformatics tools to discover critical genes in the disease. Relevant small molecule drugs, transcription factors (TFs) and microRNAs (miRNAs) were also investigated. RNA-seq data of COAD were downloaded from The Cancer Genome Atlas (TCGA). Differential analysis was performed with package edgeR. False positive discovery (FDR) 1 were set as the cut-offs to screen out differentially expressed genes (DEGs). A gene coexpression network was constructed with package EBcoexpress. GO enrichment analysis was performed for the DEGs in the gene coexpression network with DAVID. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was also performed for the genes with KOBAS 2.0. Modules were identified with MCODE of Cytoscape. Relevant small molecule drugs were predicted by Connectivity map. Relevant miRNAs and TFs were searched by WebGestalt. A total of 457 DEGs, including 255 up-regulated and 202 down-regulated genes, were identified from 437 COAD and 39 control samples. A gene coexpression network was constructed containing 40 DEGs and 101 edges. The genes were mainly associated with collagen fibril organization, extracellular matrix organization and translation. Two modules were identified from the gene coexpression network, which were implicated in muscle contraction and extracellular matrix organization, respectively. Several critical genes were disclosed, such as MYH11, COL5A2 and ribosomal proteins. Nine relevant small molecule drugs were identified, such as scriptaid and STOCK1N-35874. Accordingly, a total of 17 TFs and 10 miRNAs related to COAD were acquired, such as ETS2, NFAT, AP4, miR-124A, MiR-9, miR-96 and let-7. Several critical genes and relevant drugs, TFs and miRNAs were revealed in COAD. These findings could advance the understanding of the disease and benefit therapy development.
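
    The screening step described above (keep genes that pass both an FDR cut-off and a fold-change cut-off) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which multiple-testing adjustment was used, so Benjamini-Hochberg is assumed here; the cut-off values are not legible in this record, so the numbers in the usage lines are placeholders; and the gene names, p-values and fold changes are invented. It does not reproduce edgeR's statistical model, only the filtering applied to its output.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(genes, fdr_cutoff, abs_log_fc_cutoff):
    """genes: list of (name, log2_fold_change, raw_p_value) tuples.

    Returns (name, direction) for genes passing both cut-offs.
    """
    q_values = benjamini_hochberg([p for _, _, p in genes])
    return [
        (name, "up" if log_fc > 0 else "down")
        for (name, log_fc, _), q in zip(genes, q_values)
        if q < fdr_cutoff and abs(log_fc) > abs_log_fc_cutoff
    ]


# Invented example data; the thresholds below are placeholders.
example = [
    ("GENE_A", 2.0, 0.01),
    ("GENE_B", -1.5, 0.04),
    ("GENE_C", 0.2, 0.03),
    ("GENE_D", 3.0, 0.50),
]
degs = screen_degs(example, fdr_cutoff=0.1, abs_log_fc_cutoff=1.0)
```

    Splitting the result by direction is what yields separate counts of up-regulated and down-regulated genes such as the 255 and 202 reported above.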

  9. [Gene cloning and bioinformatics analysis of SABATH methyltransferase in Lonicera japonica var. chinensis].

    Science.gov (United States)

    Yu, Xiao-Dan; Jiang, Chao; Huang, Lu-Qi; Qin, Shuang-Shuang; Zeng, Xiang-Mei; Chen, Ping; Yuan, Yuan

    2013-08-01

    The aim was to clone the SABATH methyltransferase (rLjSABATHMT) gene in Lonicera japonica var. chinensis, and to compare the gene expression and intron sequences of the SABATH methyltransferase orthologs in L. japonica and L. japonica var. chinensis, providing a basis for understanding how the gene regulates the formation of L. japonica floral scents. The cDNA and genome sequences of LjSABATHMT from L. japonica var. chinensis were cloned according to the gene fragments in a cDNA library. The LjSABATHMT protein was characterized by bioinformatics analysis. A SABATH family phylogenetic tree was built with MEGA 5.0. The transcript levels of the SABATHMT orthologs were analyzed in different organs and different flower periods of L. japonica and L. japonica var. chinensis using RT-PCR analysis. Intron sequences of the SABATHMT orthologs were also analyzed. The cDNA of LjSABATHMT was 1,251 bp and had a complete coding frame encoding 365 amino acids. The protein had the conserved SABATHMT domain, and the phylogenetic tree showed that it may be a salicylic acid/benzoic acid methyltransferase. Higher expression of the SABATH methyltransferase orthologs was found in flowers. The intron sequences of L. japonica and L. japonica var. chinensis were highly polymorphic, and two SNPs were unique to L. japonica var. chinensis. The motif elements in the two orthologous genes differed significantly. The intron differences between the SABATH methyltransferase orthologs could contribute to the difference in gene expression between L. japonica and L. japonica var. chinensis. These results provide an important basis for regulating the active compounds of L. japonica.

  10. ANLN functions as a key candidate gene in cervical cancer as determined by integrated bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Xia L

    2018-04-01

    DEGs PPI network complex, contained 305 nodes and 4,962 edges, and 8 clusters were calculated according to k-core =2. Among them, cluster 1, which had 65 nodes and 1,780 edges, had the highest score in these clusters. In coexpression analysis, there were 86 hub genes from the Brown modules that were chosen for further analysis. Sixty-one key genes were identified as the intersecting genes of the Brown module of WGCNA and DEGs. In survival analysis, only ANLN was a prognostic factor, and the survival was significantly better in the low-expression ANLN group. Conclusion: Our study suggested that ANLN may be a potential tumor oncogene and could serve as a biomarker for predicting the prognosis of cervical cancer patients. Keywords: bioinformatics analysis, cervical cancer, WGCNA, ANLN

  11. Evaluating Dynamic Analysis Techniques for Program Comprehension

    NARCIS (Netherlands)

    Cornelissen, S.G.M.

    2009-01-01

    Program comprehension is an essential part of software development and software maintenance, as software must be sufficiently understood before it can be properly modified. One of the common approaches in getting to understand a program is the study of its execution, also known as dynamic analysis.

  12. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
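
    To make the last point concrete, a protein-protein interaction network can be held as an undirected graph and queried with simple graph measures. The sketch below is illustrative only: the protein identifiers and edges are invented, and ranking nodes by degree is just one of many graph-theoretic measures; the chapter itself does not prescribe it.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected graph as {node: set(neighbours)} from edge pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def hub_nodes(adjacency, top_n=1):
    """Return the top_n nodes by degree; ties are broken alphabetically."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return ranked[:top_n]


# Invented interactions between hypothetical proteins P1..P4.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
network = build_network(interactions)
hubs = hub_nodes(network, top_n=2)
```

    An adjacency-set representation keeps neighbour lookups cheap, which matters once a network grows to thousands of nodes and edges.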

  13. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  14. Comprehensive analysis of transport aircraft flight performance

    Science.gov (United States)

    Filippone, Antonio

    2008-04-01

    This paper reviews the state of the art in comprehensive performance codes for fixed-wing aircraft. The importance of system analysis in flight performance is discussed. The paper highlights the role of aerodynamics, propulsion, flight mechanics, aeroacoustics, flight operation, numerical optimisation, stochastic methods and numerical analysis. The latter discipline is used to investigate the sensitivities of the sub-systems to uncertainties in critical state parameters or functional parameters. The paper discusses critically the data used for performance analysis, and the areas where progress is required. Comprehensive analysis codes can be used for mission fuel planning, envelope exploration, competition analysis, a wide variety of environmental studies, marketing analysis, aircraft certification and conceptual aircraft design. A comprehensive program that uses the multi-disciplinary approach for transport aircraft is presented. The model includes a geometry deck, a separate engine input deck with the main parameters, a database of engine performance from an independent simulation, and an operational deck. The comprehensive code has modules for deriving the geometry from bitmap files, an aerodynamics model for all flight conditions, a flight mechanics model for flight envelopes and mission analysis, an aircraft noise model and engine emissions. The model is validated at different levels. Validation of the aerodynamic model is done against the scale models DLR-F4 and F6. A general model analysis and flight envelope exploration are shown for the Boeing B-777-300 with GE-90 turbofan engines with intermediate passenger capacity (394 passengers in 2 classes). Validation of the flight model is done by sensitivity analysis on the wetted area (or profile drag), on the specific air range, the brake-release gross weight and the aircraft noise. A variety of results is shown, including specific air range charts, take-off weight-altitude charts, payload-range performance

  15. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  16. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  17. PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

    Science.gov (United States)

    Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

    2017-11-01

    High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics analysis may be carried out using several public tools, many analytical pipelines allow too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

  18. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  19. Gene expression profile analysis of colorectal cancer to investigate potential mechanisms using bioinformatics

    Directory of Open Access Journals (Sweden)

    Kou YB

    2015-04-01

    Full Text Available Yubin Kou,1,2* Suya Zhang,3* Xiaoping Chen,2 Sanyuan Hu1 1Department of General Surgery, Qilu Hospital of Shandong University, Jinan, People’s Republic of China; 2Department of General Surgery, 3Department of Neurology, Shuguang Hospital Baoshan Branch, Shanghai, People’s Republic of China *These authors contributed equally to this work Abstract: This study aimed to explore the underlying molecular mechanisms of colorectal cancer (CRC) using bioinformatics analysis. Using GSE4107 datasets downloaded from the Gene Expression Omnibus, the differentially expressed genes (DEGs) were screened by comparing the RNA expression from the colonic mucosa between 12 CRC patients and ten healthy controls using a paired t-test. The Gene Ontology (GO) functional and pathway enrichment analyses of DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) software followed by the construction of a protein–protein interaction (PPI) network. In addition, hub gene identification and GO functional and pathway enrichment analyses of the modules were performed. A total of 612 up- and 639 downregulated genes were identified. The upregulated DEGs were mainly involved in the regulation of cell growth, migration, and the MAPK signaling pathway. The downregulated DEGs were significantly associated with oxidative phosphorylation, Alzheimer’s disease, and Parkinson’s disease. Moreover, FOS, FN1, PPP1CC, and CYP2B6 were selected as hub genes in the PPI networks. Two modules (up-A and up-B) in the upregulated PPI network and three modules (d-A, d-B, and d-C) in the downregulated PPI were identified with the threshold of Molecular Complex Detection (MCODE) score ≥4 and nodes ≥6. The genes in module up-A were significantly enriched in neuroactive ligand–receptor interactions and the calcium signaling pathway. The genes in module d-A were enriched in four pathways, including oxidative

  20. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    Directory of Open Access Journals (Sweden)

    Wouter Boomsma

    2016-02-01

    Full Text Available The ubiquitin-proteasome system targets misfolded proteins for degradation. Since the accumulation of such proteins is potentially harmful for the cell, their prompt removal is important. E3 ubiquitin-protein ligases mediate substrate ubiquitination by bringing together the substrate with an E2 ubiquitin-conjugating enzyme, which transfers ubiquitin to the substrate. For misfolded proteins, substrate recognition is generally delegated to molecular chaperones that subsequently interact with specific E3 ligases. An important exception is San1, a yeast E3 ligase. San1 harbors extensive regions of intrinsic disorder, which provide both conformational flexibility and sites for direct recognition of misfolded targets of vastly different conformations. So far, no mammalian ortholog of San1 is known, nor is it clear whether other E3 ligases utilize disordered regions for substrate recognition. Here, we conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of their ordered regions, and did not capture the unique disorder patterns that encode the functional mechanism of San1. However, by searching specifically for key features of the San1 sequence, such as long regions of intrinsic disorder embedded with short stretches predicted to be suitable for substrate interaction, we identified several E3 ligases with these characteristics. Our initial analysis revealed that another remarkable trait of San1 is shared with several candidate E3 ligases: long stretches of complete lysine suppression, which in San1 limits auto-ubiquitination. We encode these characteristic features into a San1 similarity-score, and present a set of proteins that are plausible candidates as San1 counterparts in humans. In conclusion, our work

  1. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    Science.gov (United States)

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences upstream of the CRISPRs could be classified into the same groups as those downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
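
    A palindromic motif in double-stranded DNA is a stretch that equals its own reverse complement. The sketch below shows one simple way to scan a sequence for such motifs of a fixed length; it is not the authors' method (the abstract names only BLAST, alignment and RNA structure prediction), and the example sequence is invented.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5'->3')."""
    s = seq.upper()
    return s == reverse_complement(s)


def find_palindromes(seq: str, length: int):
    """Return (start, motif) for every palindromic window of the given length."""
    s = seq.upper()
    return [
        (i, s[i:i + length])
        for i in range(len(s) - length + 1)
        if is_palindromic(s[i:i + length])
    ]


# Invented sequence containing the classic palindrome GAATTC.
hits = find_palindromes("TTGAATTCAA", length=6)
```

    Exact matching is the strictest possible criterion; motifs described as only "relatively conserved" would in practice need a tolerance for mismatches.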

  2. Bioinformatic Analysis of Potential Biomarkers for Spinal Cord Injured Patients With Intractable Neuropathic Pain.

    Science.gov (United States)

    Wang, Yimin; Ye, Fang; Huang, Chanyan; Xue, Faling; Li, Yingyuan; Gao, Shaowei; Qiu, Zeting; Li, Si; Chen, Qinchang; Zhou, Huaqiang; Song, Yiyan; Huang, Wenqi; Tan, Wulin; Wang, Zhongxing

    2018-03-15

    Neuropathic pain is one of the common complications after spinal cord injury (SCI), affecting patients' quality of life. The molecular mechanism of neuropathic pain after SCI is still unclear. We aimed to discover potential genes and microRNAs (miRNAs) related to neuropathic pain by bioinformatics methods. Microarray data of GSE69901 were obtained from the Gene Expression Omnibus (GEO) database. Peripheral blood samples from patients with or without neuropathic pain after SCI were collected. Twelve samples with neuropathic pain and 13 samples without pain as controls were included in the downloaded microarray. Differentially expressed genes (DEGs) between the neuropathic pain group and the control group were detected using the GEO2R online tool. Functional enrichment analysis of DEGs was performed using the DAVID database. A protein-protein interaction (PPI) network was constructed from the STRING database. MiRNAs targeting these DEGs were obtained from the miRNet database. A merged miRNA-DEG network was constructed and analyzed with Cytoscape software. A total of 1134 DEGs were identified between patients with and without neuropathic pain (case and control), and 454 biological processes were enriched. We identified 4 targeted miRNAs, mir-204-5p, mir-519d-3p, mir-20b-5p and mir-6838-5p, which may be potential biomarkers for SCI patients. Protein modification and the regulation of biological processes of the central nervous system may be risk factors in SCI patients. Certain genes and miRNAs may be potential biomarkers for the prediction of, and potential targets for the prevention and treatment of, neuropathic pain after SCI.

  3. Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.

    Science.gov (United States)

    Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui

    2016-06-01

    In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) gene family forms the largest R-gene family among plant genomes and plays an important role in plant disease resistance. In this paper, a comprehensive analysis of NBS-encoding genes is performed in the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and a comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on the domain, these genes are divided into two groups and distributed in all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-binding domain and one leucine-rich repeat domain. Our results also show that the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.

  4. Bioinformatics and Phylogenetic Analysis of Mitochondrial COX3 Gene in Iranian Camelus Dromedaries and Camelus Bactrianus

    Directory of Open Access Journals (Sweden)

    Tooba Abbassi-Daloii

    2016-11-01

    Full Text Available Introduction: Camels belong to the family Camelidae, suborder Tylopoda, order Artiodactyla and class Mammalia. The family Camelidae currently has two Old World species, the double-humped camel (CAMELUS BACTRIANUS) and the single-humped camel (CAMELUS DROMEDARIES), and four New World (tribe Lamini) species: guanaco (LAMA GUANICOE), llama (LAMA GLAMA), alpaca (LAMA PACOS) and vicuna (LAMA VICUGNA or VICUGNA VICUGNA). The single-humped camel inhabits Afro-Arabia, Ethiopia and west Central Asia, while the double-humped camel inhabits eastern Central Asia and China. The camel has been a historically and economically important species worldwide, especially in Africa and Asia. It has unique characteristics that enable it to adapt to its desert environment. The worldwide camel population is currently estimated at about 23 million. Somalia and Sudan together hold approximately 50% of the whole camel population. In the last 40 years, the number of camels has increased by almost 45%. Iranian native species are considered part of the national capital, so their preservation is important. Because of the severe decrease in their population in some areas, more attention to the conservation genetics of these species is needed. The aim of this study was bioinformatics and phylogenetic analysis of the mitochondrial sequence of cytochrome c oxidase subunit 3 (COX3) in Iranian Camelus dromedaries and Camelus bactrianus. Materials and Methods: For this purpose, 10 blood samples were collected from each species (20 samples in total). After DNA extraction, a 979-bp fragment of mitochondrial DNA was amplified using polymerase chain reaction. Sequencing was performed by the automated Sanger method, and the obtained sequences were compared with sequences from other studies. The nucleotide sequences obtained were edited using the PHRED software (http://www.phrap.org/phredphrapconsed.html). After editing, basic local alignment search tool

  5. The Cytotoxicity Mechanism of 6-Shogaol-Treated HeLa Human Cervical Cancer Cells Revealed by Label-Free Shotgun Proteomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Qun Liu

    2012-01-01

    Full Text Available Cervical cancer is one of the most common cancers among women in the world. 6-Shogaol is a natural compound isolated from the rhizome of ginger (Zingiber officinale). In this paper, we demonstrated that 6-shogaol induced apoptosis and G2/M phase arrest in human cervical cancer HeLa cells. Endoplasmic reticulum stress and the mitochondrial pathway were involved in 6-shogaol-mediated apoptosis. Proteomic analysis based on a label-free strategy by liquid chromatography chip quadrupole time-of-flight mass spectrometry was subsequently proposed to identify, in a non-target-biased manner, the molecular changes in cellular proteins in response to 6-shogaol treatment. A total of 287 proteins were differentially expressed in response to 24 h treatment with 15 μM 6-shogaol in HeLa cells. Significantly changed proteins were subjected to functional pathway analysis using multiple analysis tools. Ingenuity pathway analysis (IPA) suggested that 14-3-3 signaling is a predominant canonical pathway involved in networks which may be significantly associated with the process of apoptosis and G2/M cell cycle arrest induced by 6-shogaol. In conclusion, this work developed an unbiased protein analysis strategy by shotgun proteomics and bioinformatics analysis. Data observed provide a comprehensive analysis of the 6-shogaol-treated HeLa cell proteome and reveal protein alterations that are associated with its anticancer mechanism.

  6. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans.

    Science.gov (United States)

    Li, Yan-Hui; Zhang, Gai-Gai

    2016-04-12

    DAF-16, the C. elegans FOXO transcription factor, is an important determinant in aging and longevity. In this work, we manually curated FOXODB (http://lyh.pkmu.cn/foxodb/), a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis of 109 DAF-16 direct targets in C. elegans yielded the following results. (i) DAF-16 and transcription factor PQM-1 co-regulate some targets. (ii) Seventeen targets directly regulate lifespan. (iii) Four targets are involved in lifespan extension induced by dietary restriction. (iv) DAF-16 direct targets might play global roles in lifespan regulation.

  7. MLH1 Promoter Methylation and Prediction/Prognosis of Gastric Cancer: A Systematic Review and Meta and Bioinformatic Analysis.

    Science.gov (United States)

    Shen, Shixuan; Chen, Xiaohui; Li, Hao; Sun, Liping; Yuan, Yuan

    2018-01-01

    Background: The relationship between promoter methylation of the MLH1 gene and gastric cancer (GC) has been investigated previously. To reach a more credible conclusion, we performed a systematic review, meta-analysis and bioinformatic analysis to clarify the role of MLH1 methylation in the prediction and prognosis of GC. Methods: Eligible studies on MLH1 methylation and GC were identified by searching PubMed, Web of Science, Embase, BIOSIS, CNKI and Wanfang Data. The strength of the association was estimated by the odds ratio (OR) with its 95% confidence interval (CI). The Newcastle-Ottawa scale was used for quality assessment. Subgroup and sensitivity analyses were conducted to explore sources of heterogeneity. The Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) were employed for bioinformatics analysis of the correlation between MLH1 methylation and GC risk, clinicopathological behavior as well as prognosis. Results: 2365 GC cases and 1563 controls were included in the meta-analysis. The pooled OR of MLH1 methylation in GC was 4.895 (95% CI: 3.149-7.611, P[...] MLH1 methylation enhanced GC risk but might not be related to GC clinicopathological features and prognosis. Conclusion: MLH1 methylation is an alive biomarker for the prediction of GC and it might not affect GC behavior. Further study could be conducted to verify the impact of MLH1 methylation on GC prognosis.
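
The Methods above estimate association strength by an odds ratio with a 95% confidence interval. As a minimal sketch of that arithmetic for a single 2x2 table (not the pooled, multi-study estimate reported in the abstract), assuming a Wald interval on the log scale; the function name and the counts are hypothetical and are not taken from the study:

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for one 2x2 table.

    a: methylated cases      b: unmethylated cases
    c: methylated controls   d: unmethylated controls
    z=1.96 corresponds to a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a single table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 10, 90)
print(f"OR = {or_value:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

A meta-analysis would combine such per-study estimates with a fixed- or random-effects model; that pooling step is not shown here.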

  8. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  9. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  10. Comprehensive proteomic analysis of human pancreatic juice

    DEFF Research Database (Denmark)

    Grønborg, Mads; Bunkenborg, Jakob; Kristiansen, Troels Zakarias

    2004-01-01

    Proteomic technologies provide an excellent means for analysis of body fluids for cataloging protein constituents and identifying biomarkers for early detection of cancers. The biomarkers currently available for pancreatic cancer, such as CA19-9, lack adequate sensitivity and specificity...... contributing to late diagnosis of this deadly disease. In this study, we carried out a comprehensive characterization of the "pancreatic juice proteome" in patients with pancreatic adenocarcinoma. Pancreatic juice was first fractionated by 1-dimensional gel electrophoresis and subsequently analyzed by liquid...... in this study could be directly assessed for their potential as biomarkers for pancreatic cancer by quantitative proteomics methods or immunoassays....

  11. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  12. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  13. BioQueue: a novel pipeline framework to accelerate bioinformatics analysis.

    Science.gov (United States)

    Yao, Li; Wang, Heming; Song, Yuanyuan; Sui, Guangchao

    2017-10-15

    With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users' experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. BioQueue is freely available at https://github.com/liyao001/BioQueue. The extensive documentation can be found at http://bioqueue.readthedocs.io. Contact: li_yao@outlook.com or gcsui@nefu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
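
The idea of a checkpoint that compares each step's estimated resources (CPU, memory and disk) with what is free before dispatching can be sketched as follows. This is an illustrative toy, not BioQueue's code or API: the `Job` class, the `dispatch` function and all numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpu: int         # estimated cores needed (hypothetical estimate)
    mem_gb: float    # estimated memory needed
    disk_gb: float   # estimated disk space needed


def dispatch(jobs, free_cpu, free_mem_gb, free_disk_gb):
    """Start a job only if its estimate fits the remaining resources.

    Returns (running, waiting) lists of job names. Jobs that do not
    fit are held back instead of overloading the machine.
    """
    running, waiting = [], []
    for job in jobs:
        fits = (job.cpu <= free_cpu
                and job.mem_gb <= free_mem_gb
                and job.disk_gb <= free_disk_gb)
        if fits:
            # Reserve the estimated resources for this job
            free_cpu -= job.cpu
            free_mem_gb -= job.mem_gb
            free_disk_gb -= job.disk_gb
            running.append(job.name)
        else:
            waiting.append(job.name)
    return running, waiting


jobs = [Job("align", 4, 8, 20), Job("sort", 2, 16, 10), Job("call", 2, 4, 5)]
running, waiting = dispatch(jobs, free_cpu=8, free_mem_gb=16, free_disk_gb=100)
print("running:", running)
print("waiting:", waiting)
```

In this toy run the "sort" step is held back because its memory estimate exceeds what remains after "align" has been started, which is the kind of overload the abstract describes the checkpoint as avoiding.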

  14. Bioinformatics analysis of the predicted polyprenol reductase genes in higher plants

    Science.gov (United States)

    Basyuni, M.; Wati, R.

    2018-03-01

    The present study applies bioinformatics methods to analyze twenty-four predicted polyprenol reductase genes from higher plants in GenBank and to predict their structure, composition, similarity, subcellular localization, and phylogeny. The physicochemical properties of plant polyprenol showed diversity among the observed genes. The percentage of the secondary structure of plant polyprenol genes followed the ratio order of α helix > random coil > extended chain structure. The chloroplast transit peptide values, but not the signal peptide values, were too low, indicating few chloroplast transit peptides in plant polyprenol reductase genes. The predicted likelihood of a transit peptide varied among the plant polyprenol reductases, suggesting the importance of understanding the variety of peptide components of plant polyprenol genes. To clarify this finding, a phylogenetic tree was drawn. The tree shows several branches, suggesting that plant polyprenol reductase genes group into divergent clusters.

  15. Genetic and bioinformatics analysis of four novel GCK missense variants detected in Caucasian families with GCK-MODY phenotype.

    Science.gov (United States)

    Costantini, S; Malerba, G; Contreas, G; Corradi, M; Marin Vargas, S P; Giorgetti, A; Maffeis, C

    2015-05-01

    Heterozygous loss-of-function mutations in the glucokinase (GCK) gene cause maturity-onset diabetes of the young (MODY) subtype GCK (GCK-MODY/MODY2). GCK sequencing revealed 16 distinct mutations (13 missense, 1 nonsense, 1 splice site, and 1 frameshift-deletion) co-segregating with hyperglycaemia in 23 GCK-MODY families. Four missense substitutions (c.718A>G/p.Asn240Asp, c.757G>T/p.Val253Phe, c.872A>C/p.Lys291Thr, and c.1151C>T/p.Ala384Val) were novel and a founder effect for the nonsense mutation (c.76C>T/p.Gln26*) was supposed. We tested whether an accurate bioinformatics approach could strengthen family-genetic evidence for missense variant pathogenicity in routine diagnostics, where wet-lab functional assays are generally unviable. In silico analyses of the novel missense variants, including orthologous sequence conservation, amino acid substitution (AAS)-pathogenicity predictors, structural modeling and splicing predictors, suggested that the AASs and/or the underlying nucleotide changes are likely to be pathogenic. This study shows how a careful bioinformatics analysis could provide effective suggestions to help molecular-genetic diagnosis in absence of wet-lab validations. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

    Science.gov (United States)

    Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

    2015-06-01

    Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complement to gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
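
The clustering step described above (k-means on the weighted number of links, with the highest cluster taken as leader genes) can be illustrated with a small one-dimensional k-means. This is a sketch under stated assumptions, not the validated algorithm from the study: the gene names, scores, the choice of k=3 and the evenly spaced initial centroids are all hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain 1-D k-means with evenly spaced initial centroids (k >= 2)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid; keep the old one if a cluster is empty
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def leader_genes(weighted_links, k=3):
    """Return genes assigned to the cluster with the highest centroid."""
    centroids = kmeans_1d(list(weighted_links.values()), k)
    top = max(range(k), key=lambda i: centroids[i])
    return sorted(
        gene for gene, score in weighted_links.items()
        if min(range(k), key=lambda i: abs(score - centroids[i])) == top
    )


# Hypothetical weighted-number-of-links scores, for illustration only
scores = {"G1": 9.5, "G2": 9.0, "G3": 4.0, "G4": 3.5, "G5": 0.5, "G6": 0.2}
leaders = leader_genes(scores)
print(leaders)
```

With real data the scores would come from the STRING interaction map, and the number of clusters would need to be chosen and justified rather than fixed in advance.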

  17. iSmaRT: a toolkit for a comprehensive analysis of small RNA-Seq data.

    Science.gov (United States)

    Panero, Riccardo; Rinaldi, Antonio; Memoli, Domenico; Nassa, Giovanni; Ravo, Maria; Rizzo, Francesca; Tarallo, Roberta; Milanesi, Luciano; Weisz, Alessandro; Giurato, Giorgio

    2017-03-15

    The interest in investigating the biological roles of small non-coding RNAs (sncRNAs) is increasing, due to the pleiotropic effects these molecules exert in many biological contexts. While several methods and tools are available to study microRNAs (miRNAs), only a few focus on novel classes of sncRNAs, in particular PIWI-interacting RNAs (piRNAs). To overcome these limitations, we implemented iSmaRT (integrative Small RNA Tool-kit), an automated pipeline to analyze smallRNA-Seq data. iSmaRT is a collection of bioinformatics tools and in-house algorithms, interconnected through a Graphical User Interface (GUI). In addition to performing comprehensive analyses on miRNAs, it implements specific computational modules to analyze piRNAs, predicting novel ones and identifying their RNA targets. A smallRNA-Seq dataset generated from brain samples of Huntington's Disease patients was used here to illustrate iSmaRT performances, demonstrating how the pipeline can provide, in a rapid and user friendly way, a comprehensive analysis of different classes of sncRNAs. iSmaRT is freely available on the web at ftp://labmedmolge-1.unisa.it (User: iSmart - Password: password). Contact: aweisz@unisa.it or ggiurato@unisa.it. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  18. Functional and bioinformatics analysis of an exopolysaccharide-related gene (epsN) from Lactobacillus kefiranofaciens ZW3.

    Science.gov (United States)

    Wang, Jingrui; Tang, Wei; Zheng, Yongna; Xing, Zhuqing; Wang, Yanping

    2016-09-01

    A novel lactic acid bacteria strain, Lactobacillus kefiranofaciens ZW3, exhibited high production of exopolysaccharide (EPS). The epsN gene, located in the eps gene cluster of this strain, is associated with EPS biosynthesis. Bioinformatics analysis of this gene was performed. The conserved domain analysis showed that the EpsN protein contained MATE-Wzx-like domains. The epsN gene was then amplified to construct the recombinant expression vector pMG36e-epsN. The results showed that the EPS yields of the recombinants were significantly improved. From the measured yields of EPS and intracellular polysaccharide, it was concluded that the epsN gene could play a Wzx flippase role in EPS biosynthesis. This is the first demonstration of the effect of EpsN on L. kefiranofaciens EPS biosynthesis, and it further supports its functional role.

  19. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  20. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Directory of Open Access Journals (Sweden)

    Jinkyu Kim

    Full Text Available The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.
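
To make the two ingredients named above concrete, the sketch below computes a plug-in transfer entropy between two discretized series and combines several TE values with a weighted sum. It is an assumption-laden illustration, not the paper's method: it uses a history length of one, symbol counts instead of any particular estimator, invented toy series and weights, and it omits the significance testing the abstract mentions.

```python
import math
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(x_next, x_prev, y_prev)
             * log2( p(x_next | x_prev, y_prev) / p(x_next | x_prev) )
    where x is the target series and y is the source series.
    """
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    joint = Counter(triples)                                 # (x_next, x_prev, y_prev)
    prev_pair = Counter((xp, yp) for _, xp, yp in triples)   # (x_prev, y_prev)
    self_pair = Counter((xn, xp) for xn, xp, _ in triples)   # (x_next, x_prev)
    prev_only = Counter(xp for _, xp, _ in triples)          # x_prev
    te = 0.0
    for (xn, xp, yp), count in joint.items():
        p_joint = count / n
        p_given_both = count / prev_pair[(xp, yp)]
        p_given_self = self_pair[(xn, xp)] / prev_only[xp]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


def weighted_sum(te_values, weights):
    """Aggregate several TE values with user-chosen (hypothetical) weights."""
    return sum(w * t for w, t in zip(weights, te_values))


# Toy binary series: x copies y with a one-step lag
y = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
x = [0] + y[:-1]
te_y_to_x = transfer_entropy(y, x)
te_const = transfer_entropy([0] * len(x), x)   # a constant source carries no information
print(te_y_to_x, te_const)
```

Continuous economic series would first have to be discretized (or handled with a different estimator), and short series make plug-in estimates biased, which is one reason a significance test is needed in practice.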

  1. Bioinformatics analysis of the oxidosqualene cyclase gene and the amino acid sequence in mangrove plants

    Science.gov (United States)

    Basyuni, M.; Wati, R.

    2017-01-01

    This study applied bioinformatics methods to analyze seven oxidosqualene cyclase (OSC) genes from mangrove plants in DDBJ/EMBL/GenBank and to predict their structure, composition, similarity, subcellular localization and phylogeny. The physical and chemical properties of the seven mangrove OSCs showed variation among the genes. The percentage of the secondary structure of the seven mangrove OSC genes followed the order of α helix > random coil > extended chain structure. The values of chloroplast or signal peptide were too low, indicating that there is no chloroplast transit peptide or secretory pathway signal peptide in mangrove OSC genes. The mitochondrial target peptide value varied from 0.163 to 0.430, indicating that such a peptide might exist. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove OSC genes. To clarify the relationship among the mangrove OSC genes, a phylogenetic tree was constructed. The phylogenetic tree shows three clusters: Kandelia KcMS joins with Bruguiera BgLUS, Rhizophora RsM1 is close to Bruguiera BgbAS, and Rhizophora RcCAS joins with Kandelia KcCAS. The present study, therefore, supports the previous results that plant OSC genes form distinct clusters in the tree.

  2. Phylogenetic and bioinformatic analysis of gap junction-related proteins, innexins, pannexins and connexins.

    Science.gov (United States)

    Fushiki, Daisuke; Hamada, Yasuo; Yoshimura, Ryoichi; Endo, Yasuhisa

    2010-04-01

    All multi-cellular animals, including hydra, insects and vertebrates, develop gap junctions, which communicate directly with neighboring cells. Gap junctions consist of protein families called connexins in vertebrates and innexins in invertebrates. Connexins and innexins have no homology in their amino acid sequence, but both are thought to have some similar characteristics, such as a tetra-membrane-spanning structure, formation of a channel by a hexamer, and transmission of small molecules (e.g. ions) to neighboring cells. Pannexins were recently identified as homologs of innexins in vertebrate genomes. Although pannexins are thought to share the function of intercellular communication with connexins and innexins, there is little information about the relationship among these three protein families of gap junctions. We phylogenetically and bioinformatically examined these protein families and other tetra-membrane-spanning proteins using a database and three analytical software packages. The clades formed by pannexin families follow not the species classification but the paralogs of each pannexin member. Amino acid sequences of pannexins are closely related to those of innexins but less to those of connexins. These data suggest that innexins and pannexins have a common origin, but the relationship between innexins/pannexins and connexins is as slight as that of other tetra-membrane-spanning members.

  3. Bioinformatics analysis and construction of phylogenetic tree of aquaporins from Echinococcus granulosus.

    Science.gov (United States)

    Wang, Fen; Ye, Bin

    2016-09-01

    Cystic echinococcosis, caused by the metacestodal larvae of Echinococcus granulosus (Eg), is a chronic, worldwide, and severe zoonotic parasitosis. The treatment of cystic echinococcosis is still difficult since surgery cannot fit the needs of all patients, and drugs can lead to serious adverse events as well as resistance. Screening for target proteins that interact with new anti-hydatidosis drugs is urgently needed to meet the prevailing challenges. Here, we analyzed the sequences and structural properties of E. granulosus aquaporins (EgAQPs), and constructed a phylogenetic tree by bioinformatics methods. The MIP family signature and protein kinase C phosphorylation sites were predicted in all nine EgAQPs. α-helix and random coil were the main secondary structures of EgAQPs. The numbers of transmembrane regions were three to six, which indicated that EgAQPs contained multiple hydrophobic regions. A neighbor-joining tree indicated that EgAQPs were divided into two branches: seven EgAQPs formed a clade with human AQP1, a "strict" aquaporin, and the other two EgAQPs formed a clade with human AQP9, an aquaglyceroporin. Unfortunately, homology modeling of EgAQPs was aborted. These results provide a foundation for understanding and researching the biological function of E. granulosus.

  4. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Science.gov (United States)

    Kim, Jinkyu; Kim, Gunn; An, Sungbae; Kwon, Young-Kyun; Yoon, Sungroh

    2013-01-01

    The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.

  5. Novel SINEs families in Medicago truncatula and Lotus japonicus: bioinformatic analysis.

    Science.gov (United States)

    Gadzalski, Marek; Sakowicz, Tomasz

    2011-07-01

    Although short interspersed elements (SINEs) were discovered nearly 30 years ago, the studies of these genomic repeats were mostly limited to animal genomes. Very little is known about SINEs in legumes, one of the most important plant families. Here we report the identification, genomic distribution and molecular features of six novel SINE elements in Lotus japonicus (named LJ_SINE-1, -2, -3) and Medicago truncatula (MT_SINE-1, -2, -3), model legume species. They possess all the structural features commonly found in short interspersed elements, including an RNA polymerase III promoter, a polyA tail and flanking repeats. The SINEs described here are present in low to moderate copy numbers, from 150 to 3000. Bioinformatic analyses were used to search public databases; we have shown that three of the new SINE elements from M. truncatula seem to be characteristic of the Medicago and Trifolium genera. Two SINE families have been found in L. japonicus and one is present in both M. truncatula and L. japonicus. In addition, we discuss potential activities of the described elements. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  7. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-08-11

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.

  8. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Directory of Open Access Journals (Sweden)

    Lucas Antón Pastur-Romay

    2016-08-01

    Full Text Available Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.

  9. Bioinformatics analysis for structure and function of CPR of Plasmodium falciparum.

    Science.gov (United States)

    Fan, Zhigang; Zhang, Lingmin; Yan, Guogang; Wu, Qiang; Gan, Xiufeng; Zhong, Saifeng; Lin, Guifen

    2011-02-01

    To analyse the structure and function of NADPH-cytochrome p450 reductase (CYPOR or CPR) from Plasmodium falciparum (Pf), and to predict its drug targets and vaccine targets. The structure, function, drug targets and vaccine targets of CPR from Plasmodium falciparum were analyzed and predicted by bioinformatics methods. PfCPR, which was an older CPR, had a close relationship with the CPR from other Plasmodium species, but it was distant from that of its hosts, such as Homo sapiens and Anopheles. PfCPR was located in the cellular nucleus of Plasmodium falciparum. 335aa-352aa and 591aa-608aa were inserted into the interior side of the nuclear membrane, while 151aa-265aa was located in the nucleolus organizer regions. PfCPR had 40 function sites and 44 protein-protein binding sites in its amino acid sequence. The tertiary structure of 1aa-700aa was forceps-shaped with wings. 15 segments of PfCPR had no homology with Homo sapiens CPR and most were exposed on the surface of the protein. These segments had 25 protein-protein binding sites, while 13 other segments all possessed function sites. The evolution or genesis of Plasmodium falciparum is earlier than that of Homo sapiens. PfCPR is a possible resistance site for antimalarial drugs and may be involved in immune evasion, which is associated with the parasitism of sporozoites in hepatocytes. PfCPR is unsuitable as a vaccine target, but it has at least 13 ideal drug targets. Copyright © 2011 Hainan Medical College. Published by Elsevier B.V. All rights reserved.

  10. Decoding options and accuracy of translation of developmentally regulated UUA codon in Streptomyces: bioinformatic analysis.

    Science.gov (United States)

    Rokytskyy, Ihor; Koshla, Oksana; Fedorenko, Victor; Ostash, Bohdan

    2016-01-01

    The gene bldA for leucyl [Formula: see text] has been known for almost 30 years as a key regulator of morphogenesis and secondary metabolism in the genus Streptomyces. Codon UUA is the rarest one in Streptomyces genomes and is present exclusively in genes with auxiliary functions. Delayed accumulation of translation-competent [Formula: see text] is believed to confine the expression of UUA-containing transcripts to stationary phase. Implicit to the regulatory function of the UUA codon is the assumption of high accuracy of its translation, i.e. the latter should not occur in the absence of cognate [Formula: see text]. However, a growing body of facts points to the possibility of mistranslation of UUA-containing transcripts in the bldA-deficient mutants. It is not known what type of near-cognate tRNA(s) may decode UUA in the absence of cognate tRNA in Streptomyces, and whether UUA possesses certain inherent properties (such as increased/decreased accuracy of decoding) that would favor its use for regulatory purposes. Here we took a bioinformatic approach to address these questions. We catalogued the entire complement of tRNA genes from several relevant Streptomyces and identified genes for posttranscriptional modifications of tRNA that might be involved in UUA decoding by cognate and near-cognate tRNAs. Based on tRNA gene content in Streptomyces genomes, we propose possible scenarios of UUA codon mistranslation. UUA is not associated with an increased rate of missense errors as compared to other leucyl codons, contrasting with the general belief that low-abundance codons are more error-prone than high-abundance ones.
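
    The claim that UUA is the rarest leucine codon rests on codon-usage counting, which can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy sequence are hypothetical, and a real analysis would run over all annotated coding sequences of a genome.

```python
from collections import Counter

# The six leucine codons of the standard genetic code (RNA alphabet).
LEUCINE_CODONS = ("UUA", "UUG", "CUU", "CUC", "CUA", "CUG")


def codon_counts(cds):
    """Count in-frame codons of a coding sequence given as DNA or RNA."""
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return Counter(rna[i:i + 3] for i in range(0, usable, 3))


def leucine_codon_share(cds):
    """Return the fraction of leucine codons contributed by each synonym."""
    counts = codon_counts(cds)
    total = sum(counts[c] for c in LEUCINE_CODONS)
    if total == 0:
        return {c: 0.0 for c in LEUCINE_CODONS}
    return {c: counts[c] / total for c in LEUCINE_CODONS}


# Hypothetical toy CDS: ATG CTG CTG TTA CTC CTG TGA
toy_cds = "ATGCTGCTGTTACTCCTGTGA"
share = leucine_codon_share(toy_cds)
print({codon: round(value, 2) for codon, value in share.items()})
```

    In a GC-rich genome such as Streptomyces, the same count summed over every gene would be expected to show a very small UUA share, which is the observation the abstract starts from.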

  11. Bioinformatic analysis suggests that the Cypovirus 1 major core protein cistron harbours an overlapping gene

    Directory of Open Access Journals (Sweden)

    Atkins John F

    2008-05-01

    Full Text Available Members of the genus Cypovirus (family Reoviridae) are common pathogens of insects. These viruses have linear dsRNA genomes divided into 10–11 segments, which have generally been assumed to be monocistronic. Here, bioinformatic evidence is presented for a short overlapping coding sequence (CDS) in the cypovirus genome segment encoding the major core capsid protein VP1, overlapping the 5'-terminal region of the VP1 ORF in the +1 reading frame. In Cypovirus type 1 (CPV-1), a 62-codon AUG-initiated open reading frame (hereafter ORFX) is present in all four available segment 1 sequences. The pattern of base variations across the sequence alignment indicates that ORFX is subject to functional constraints at the amino acid level (even when the constraints due to coding in the overlapping VP1 reading frame are taken into account; MLOGD software). In fact the translated ORFX shows greater amino acid conservation than the overlapping region of VP1. The genomic location of ORFX is consistent with translation via leaky scanning. A 62–64 codon AUG-initiated ORF is present in a corresponding location and reading frame in other available cypovirus sequences (2 CPV-14, 1 CPV-15) and an 87-codon ORFX homologue may also be present in Aedes pseudoscutellaris reovirus. The ORFX amino acid sequences are hydrophilic and basic, with between 12 and 16 Arg/Lys residues in each though, at 7.5–10.2 kDa, the putative ORFX product is too small to appear on typical published protein gels.
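
    Locating an AUG-initiated ORF that overlaps a longer ORF in a different reading frame can be sketched with a simple forward-strand scan. The code below is an illustrative assumption, not the MLOGD method named in the abstract: it only finds candidate ORFs and says nothing about coding constraints. The function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Scan the three forward frames for ATG-initiated ORFs.

    Returns tuples (frame, start, end, n_codons), where start/end are
    0-based half-open coordinates including the stop codon and n_codons
    excludes the stop codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((frame, start, i + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy sequence: a 5-codon ORF in frame 0 and a 2-codon ORF
# starting at position 4, i.e. in the +1 frame relative to the first.
toy = "ATGAATGGCCTAAGCTGA"
for frame, start, end, n_codons in find_orfs(toy):
    print(frame, start, end, n_codons)
```

    Filtering the result with `min_codons=62` on a real segment 1 sequence would be one way to look for an ORFX-sized candidate, under the assumption that the sequence is given in the coding orientation.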

  12. Real analysis a comprehensive course in analysis, part 1

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 1 is devoted to real analysis. From one point of view, it presents the infinitesimal calculus of the twentieth century with the ultimate integral calculus (measure theory)

  13. Web-based bioinformatics workflows for end-to-end RNA-seq data computation and analysis in agricultural animal species

    Science.gov (United States)

    Remarkable advances in next-generation sequencing (NGS) technologies, bioinformatics algorithms, and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis still remains as a major bottleneck. RNA-seq, as one of the major area in the NGS fi...

  14. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  15. Integrative Bioinformatic Analysis of Transcriptomic Data Identifies Conserved Molecular Pathways Underlying Ionizing Radiation-Induced Bystander Effects (RIBE)

    Directory of Open Access Journals (Sweden)

    Constantinos Yeles

    2017-11-01

    Full Text Available Ionizing radiation-induced bystander effects (RIBE) encompass a number of effects with potential for a plethora of damages in adjacent non-irradiated tissue. The cascade of molecular events is initiated in response to the exposure to ionizing radiation (IR), something that may occur during diagnostic or therapeutic medical applications. In order to better investigate these complex response mechanisms, we employed a unified framework integrating statistical microarray analysis, signal normalization, and translational bioinformatics functional analysis techniques. This approach was applied to several microarray datasets from Gene Expression Omnibus (GEO) related to RIBE. The analysis produced lists of differentially expressed genes, contrasting bystander and irradiated samples versus sham-irradiated controls. Furthermore, comparative molecular analysis through BioInfoMiner, which integrates advanced statistical enrichment and prioritization methodologies, revealed discrete biological processes at the cellular level, for example the negative regulation of growth, cellular response to Zn2+-Cd2+, and Wnt and NIK/NF-kappaB signaling, thus refining the description of the phenotypic landscape of RIBE. Our results provide a more solid understanding of RIBE cell-specific response patterns, especially in the case of high-LET radiations, like α-particles and carbon-ions.

  16. Harmonic analysis a comprehensive course in analysis, part 3

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 3 returns to the themes of Part 1 by discussing pointwise limits (going beyond the usual focus on the Hardy-Littlewood maximal function by including ergodic theorems and m

  17. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  18. Molecular Cloning, Bioinformatic Analysis, and Expression of Bombyx mori Lebocin 5 Gene Related to Beauveria bassiana Infection.

    Science.gov (United States)

    Lü, Dingding; Hou, Chengxiang; Qin, Guangxing; Gao, Kun; Chen, Tian; Guo, Xijie

    2017-01-01

    A full-length cDNA of lebocin 5 (BmLeb5) was first cloned from silkworm, Bombyx mori, by rapid amplification of cDNA ends. The BmLeb5 gene is 808 bp in length and the open reading frame encodes a 179-amino acid hydroxyproline-rich peptide. Bioinformatic analysis results showed that BmLeb5 has an O-glycosylation site and four RXXR motifs, like other lebocins. Sequence similarity and phylogenetic analysis results indicated that lebocins form a multiple gene family in silkworm, as cecropins do. Quantitative real-time PCR analysis revealed that BmLeb5 was most highly expressed in the fat body. In the silkworm larvae infected by Beauveria bassiana, the expression level of BmLeb5 was upregulated in the fat body and hemolymph, which are the most important immune tissues in silkworm. The recombinant protein of BmLeb5 was for the first time successfully expressed with a prokaryotic expression system and purified. There are no reports so far that the expression of lebocins could be induced by entomopathogenic fungus. Our study suggested that BmLeb5 might play an important role in the immune response of silkworm to defend against B. bassiana infection. The results also provided helpful information for further studying how the lebocin family functions in the antifungal immune response in the silkworm.

  19. PinAPL-Py: A comprehensive web-application for the analysis of CRISPR/Cas9 screens.

    Science.gov (United States)

    Spahn, Philipp N; Bath, Tyler; Weiss, Ryan J; Kim, Jihoon; Esko, Jeffrey D; Lewis, Nathan E; Harismendy, Olivier

    2017-11-20

    Large-scale genetic screens using CRISPR/Cas9 technology have emerged as a major tool for functional genomics. With its increased popularity, experimental biologists frequently acquire large sequencing datasets for which they often do not have an easy analysis option. While a few bioinformatic tools have been developed for this purpose, their utility is still hindered either due to limited functionality or the requirement of bioinformatic expertise. To make sequencing data analysis of CRISPR/Cas9 screens more accessible to a wide range of scientists, we developed a Platform-independent Analysis of Pooled Screens using Python (PinAPL-Py), which is operated as an intuitive web-service. PinAPL-Py implements state-of-the-art tools and statistical models, assembled in a comprehensive workflow covering sequence quality control, automated sgRNA sequence extraction, alignment, sgRNA enrichment/depletion analysis and gene ranking. The workflow is set up to use a variety of popular sgRNA libraries as well as custom libraries that can be easily uploaded. Various analysis options are offered, suitable to analyze a large variety of CRISPR/Cas9 screening experiments. Analysis output includes ranked lists of sgRNAs and genes, and publication-ready plots. PinAPL-Py helps to advance genome-wide screening efforts by combining comprehensive functionality with user-friendly implementation. PinAPL-Py is freely accessible at http://pinapl-py.ucsd.edu with instructions and test datasets.

  20. Identification of key genes and pathways associated with neuropathic pain in uninjured dorsal root ganglion by using bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Chen CJ

    2017-11-01

    Full Text Available Chao-Jin Chen,* De-Zhao Liu,* Wei-Feng Yao, Yu Gu, Fei Huang, Zi-Qing Hei, Xiang Li. Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, People’s Republic of China. *These authors contributed equally to this work. Purpose: Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs) drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs) and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL) by using bioinformatic analysis. Materials and methods: The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein–protein interaction (PPI) network and module analysis. Real-time polymerase chain reaction (PCR) and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model. Results: A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were

  1. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2 with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publicly available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  2. A Comprehensive Analysis of Marketing Journal Rankings

    Science.gov (United States)

    Steward, Michelle D.; Lewis, Bruce R.

    2010-01-01

    The purpose of this study is to offer a comprehensive assessment of journal standings in Marketing from two perspectives. The discipline perspective of rankings is obtained from a collection of published journal ranking studies during the past 15 years. The studies in the published ranking stream are assessed for reliability by examining internal…

  3. [Expression profiles and bioinformatic analysis of miRNA in human dental pulp cells during endothelial differentiation].

    Science.gov (United States)

    Gong, Qimei; Jiang, Hongwei; Wang, Jinming; Ling, Junqi

    2014-05-01

    To investigate the differential expression profile and bioinformatic analysis of microRNA (miRNA) in human dental pulp cells (DPC) during endothelial differentiation. DPC were cultured in endothelial induction medium (50 µg/L vascular endothelial growth factor, 10 µg/L basic fibroblast growth factor and 2% fetal calf serum) for 7 days. Meanwhile non-induced DPC were used as control. Quantitative real-time PCR (qRT-PCR) was applied to detect vascular endothelial marker genes [CD31, von Willebrand factor (vWF) and vascular endothelial-cadherin (VE-cadherin)] and in vitro tube formation on matrigel was used to analyze the angiogenic ability of differentiated cells. The miRNA expression profiles of DPC were then examined using miRNA microarray and the differentially expressed miRNA were validated by qRT-PCR. Furthermore, bioinformatic analysis was employed to predict the target genes of miRNA and to analyze the possible biological functions and signaling pathways that were involved in DPC after induction. The relative mRNA levels of CD31, vWF and VE-cadherin were (3.48 ± 0.22) ×10⁻⁴, (3.13 ± 0.31) ×10⁻⁴ and (39.60 ± 2.36) ×10⁻⁴ in the control group, and (19.57 ± 2.20) ×10⁻⁴, (48.13 ± 0.54) ×10⁻⁴ and (228.00 ± 8.89) ×10⁻⁴ in the induced group, respectively. The expressions of CD31, vWF and VE-cadherin were increased significantly in endothelial induced DPC compared to the control group (P functions, such as the regulation of transcription, cell motion, blood vessel morphogenesis, angiogenesis and cytoskeletal protein, and signaling pathways including the mitogen-activated protein kinase (MAPK) and the Wnt signaling pathway. The differential miRNA expression identified in this study may be involved in governing DPC endothelial differentiation, thus contributing to the future research on regulatory mechanisms in dental pulp angiogenesis.

  4. Bioinformatics tools for the analysis of NMR metabolomics studies focused on the identification of clinically relevant biomarkers.

    Science.gov (United States)

    Puchades-Carrasco, Leonor; Palomino-Schätzlein, Martina; Pérez-Rambla, Clara; Pineda-Lucena, Antonio

    2016-05-01

    Metabolomics, a systems biology approach focused on the global study of the metabolome, offers a tremendous potential in the analysis of clinical samples. Among other applications, metabolomics enables mapping of biochemical alterations involved in the pathogenesis of diseases, and offers the opportunity to noninvasively identify diagnostic, prognostic and predictive biomarkers that could translate into early therapeutic interventions. Particularly, metabolomics by Nuclear Magnetic Resonance (NMR) has the ability to simultaneously detect and structurally characterize an abundance of metabolic components, even when their identities are unknown. Analysis of the data generated using this experimental approach requires the application of statistical and bioinformatics tools for the correct interpretation of the results. This review focuses on the different steps involved in the metabolomics characterization of biofluids for clinical applications, ranging from the design of the study to the biological interpretation of the results. Particular emphasis is devoted to the specific procedures required for the processing and interpretation of NMR data with a focus on the identification of clinically relevant biomarkers. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  5. Identification of Differentially Expressed Genes Induced by Aberrant Methylation in Oral Squamous Cell Carcinomas Using Integrated Bioinformatic Analysis

    Directory of Open Access Journals (Sweden)

    Xiaoqi Zhang

    2018-06-01

    Full Text Available Oral squamous cell carcinoma (OSCC) is a malignant disease. Methylation plays a key role in the etiology and pathogenesis of OSCC. The goal of this study was to identify aberrantly methylated differentially expressed genes (DEGs) in OSCCs, and to explore the underlying mechanisms of tumorigenesis by using integrated bioinformatic analysis. Gene expression profiles (GSE30784 and GSE38532) were analyzed using the R software to obtain aberrantly methylated DEGs. Functional enrichment analysis of screened genes was performed using the DAVID software. Protein–protein interaction (PPI) networks were constructed using the STRING database. The cBioPortal software was used to exhibit the alterations of genes. Lastly, we validated the results with the Cancer Genome Atlas (TCGA) data. Twenty-eight upregulated hypomethylated genes and 24 downregulated hypermethylated genes were identified. These genes were enriched in the biological process of regulation in immune response, and were mainly involved in the PI3K-AKT and EMT pathways. Additionally, three upregulated hypomethylated oncogenes and four downregulated hypermethylated tumor suppressor genes (TSGs) were identified. In conclusion, our study indicated possible aberrantly methylated DEGs and pathways in OSCCs, which could improve the understanding of the underlying molecular mechanisms. Aberrantly methylated oncogenes and TSGs may also serve as biomarkers and therapeutic targets for the precise diagnosis and treatment of OSCCs in the future.

  6. The Comprehension Problems of Children with Poor Reading Comprehension despite Adequate Decoding: A Meta-Analysis.

    Science.gov (United States)

    Spencer, Mercedes; Wagner, Richard K

    2018-06-01

    The purpose of this meta-analysis was to examine the comprehension problems of children who have a specific reading comprehension deficit (SCD), which is characterized by poor reading comprehension despite adequate decoding. The meta-analysis included 86 studies of children with SCD who were assessed in reading comprehension and oral language (vocabulary, listening comprehension, storytelling ability, and semantic and syntactic knowledge). Results indicated that children with SCD had deficits in oral language (d = -0.78, 95% CI [-0.89, -0.68]), but these deficits were not as severe as their deficit in reading comprehension (d = -2.78, 95% CI [-3.01, -2.54]). When compared to reading comprehension age-matched normal readers, the oral language skills of the two groups were comparable (d = 0.32, 95% CI [-0.49, 1.14]), which suggests that the oral language weaknesses of children with SCD represent a developmental delay rather than developmental deviance. Theoretical and practical implications of these findings are discussed.
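
    The reasoning behind "comparable" can be made explicit from the reported intervals alone. The sketch below assumes the 95% confidence intervals are symmetric normal-approximation intervals, which the abstract does not state, and the labels and helper names are hypothetical; it back-calculates an approximate standard error and checks whether each interval contains zero.

```python
Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper, z=Z_95):
    """Approximate standard error implied by a symmetric normal CI."""
    return (upper - lower) / (2 * z)


def ci_includes_zero(lower, upper):
    """True when the interval is compatible with no group difference."""
    return lower <= 0.0 <= upper


# (d, lower, upper) exactly as reported in the abstract.
reported = {
    "oral language deficit": (-0.78, -0.89, -0.68),
    "reading comprehension deficit": (-2.78, -3.01, -2.54),
    "oral language vs. comprehension-age-matched readers": (0.32, -0.49, 1.14),
}

for label, (d, lower, upper) in reported.items():
    print(label, d, round(se_from_ci(lower, upper), 3),
          ci_includes_zero(lower, upper))
```

    Only the third interval spans zero, which is why that comparison is read as showing no reliable difference while the first two are read as deficits.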

  7. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly-used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. Not only does the concept presented in this paper give a useful framework to design MEA-based estimators, but it is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
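
    The accuracy measures named in the abstract are all functions of the four confusion-matrix counts of a binary prediction. As a minimal sketch using the standard textbook definitions (the function name and the example counts are hypothetical, and this is not the estimator proposed in the paper):

```python
import math


def accuracy_measures(tp, fp, tn, fn):
    """Sensitivity, PPV, F-score and MCC from confusion-matrix counts.

    Undefined ratios (zero denominator) are reported as 0.0 by convention.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": ratio(tp, tp + fn),          # recall of true positives
        "ppv": ratio(tp, tp + fp),                  # precision
        "f_score": ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of the two
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, e.g. predicted versus reference base pairs.
measures = accuracy_measures(tp=6, fp=2, tn=10, fn=2)
print({name: round(value, 4) for name, value in measures.items()})
```

    Sensitivity and PPV each ignore one kind of error, whereas the F-score and MCC combine both, which is one reason an estimator tuned for one measure can look poor under another.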

  8. Cloning and bioinformatics analysis of an ubiquitin gene of the rice ...

    African Journals Online (AJOL)

    Subcellular localization analysis showed that CsUB protein of cytoplasm, cell nucleus, mitochondrion, cell skeleton and plasma membrane occupied about 47.80, 26.10, 17.40, 4.30 and 4.30%, respectively. Sequence, homology and structural analysis confirmed that CsUB gene was highly conserved during evolution and ...

  9. Bioinformatics-Aided Venomics

    Directory of Open Access Journals (Sweden)

    Quentin Kaas

    2015-06-01

    Full Text Available Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future.

  10. Analysis of Comprehensive Utilization of Coconut Waste

    OpenAIRE

    Zheng, Kan; Liang, Dong; Zhang, Xirui

    2013-01-01

    This paper describes and analyzes the coconut cultivation in China, and the current comprehensive utilization of waste resources generated during cultivation and processing of coconut. The wastes generated in the process of cultivation include old coconut tree trunk, roots, withered coconut leaves, coconut flower and fallen cracking coconut, mainly used for biogas extraction, direct combustion and power generation, brewing, pharmacy, and processing of building materials; the wastes generated ...

  11. Genetic and bioinformatic analysis of 41C and the 2R heterochromatin of Drosophila melanogaster: a window on the heterochromatin-euchromatin junction.

    OpenAIRE

    Myster, Steven H; Wang, Fei; Cavallo, Robert; Christian, Whitney; Bhotika, Seema; Anderson, Charles T; Peifer, Mark

    2004-01-01

    Genomic sequences provide powerful new tools in genetic analysis, making it possible to combine classical genetics with genomics to characterize the genes in a particular chromosome region. These approaches have been applied successfully to the euchromatin, but analysis of the heterochromatin has lagged somewhat behind. We describe a combined genetic and bioinformatics approach to the base of the right arm of the Drosophila melanogaster second chromosome, at the boundary between pericentric h...

  12. Bioinformatics Analysis for the Antirheumatic Effects of Huang-Lian-Jie-Du-Tang from a Network Perspective

    Directory of Open Access Journals (Sweden)

    Haiyang Fang

    2013-01-01

    Full Text Available Huang-Lian-Jie-Du-Tang (HLJDT) is a classic TCM formula to clear “heat” and “poison” that exhibits antirheumatic activity. Here we investigated the therapeutic mechanisms of HLJDT at the protein network level using a bioinformatics approach. It was found that HLJDT shares 5 target proteins with 3 types of anti-RA drugs, and several pathways in the immune system and bone formation are significantly regulated by HLJDT’s components, suggesting the therapeutic effect of HLJDT on RA. By defining an antirheumatic effect score to quantitatively measure the therapeutic effect, we found that the score of each HLJDT component is very low, while the whole HLJDT achieves a much higher effect score, suggesting a synergistic effect of HLJDT achieved by its multiple components acting on multiple targets. Finally, topological analysis on the RA-associated PPI network was conducted to illustrate key roles of HLJDT’s target proteins in this network. Integrating our findings with TCM theory suggests that HLJDT targets hub nodes and the main pathway in the Hot ZENG network, and thus it could be applied as adjuvant treatment for Hot-ZENG-related RA. This study may facilitate our understanding of the antirheumatic effect of HLJDT and it may suggest a new approach for the study of TCM pharmacology.

  13. Transcriptomic and bioinformatics analysis of the early time-course of the response to prostaglandin F2 alpha in the bovine corpus luteum

    Directory of Open Access Journals (Sweden)

    Heather Talbott

    2017-10-01

    Full Text Available RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069). Subsequent statistical analysis determined differentially expressed transcripts ± 1.5-fold change from saline control with P ≤ 0.05. Gene ontology of differentially expressed transcripts was annotated by DAVID and Panther. Physiological characteristics of the study animals are presented in a figure. Bioinformatic analysis by Ingenuity Pathway Analysis was curated, compiled, and presented in tables. A dataset comparison with similar microarray analyses was performed and bioinformatics analysis by Ingenuity Pathway Analysis, DAVID, Panther, and String of differentially expressed genes from each dataset as well as the differentially expressed genes common to all three datasets were curated, compiled, and presented in tables. Finally, a table comparing four bioinformatics tools’ predictions of functions associated with genes common to all three datasets is presented. These data have been further analyzed and interpreted in the companion article “Early transcriptome responses of the bovine mid-cycle corpus luteum to prostaglandin F2 alpha includes cytokine signaling” [1].
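
    The selection rule stated above (± 1.5-fold change with P ≤ 0.05) can be sketched as a simple filter. The interpretation is an assumption: "± 1.5-fold" is read here as a treated/control ratio of at least 1.5 or at most 1/1.5 on the linear scale. The transcript identifiers and values are hypothetical and do not come from GSE94069.

```python
def is_differentially_expressed(fold_change, p_value,
                                fc_cutoff=1.5, p_cutoff=0.05):
    """Apply a fold-change and P-value threshold to one transcript.

    fold_change is the treated/control ratio on the linear scale (> 0).
    """
    changed = fold_change >= fc_cutoff or fold_change <= 1.0 / fc_cutoff
    return changed and p_value <= p_cutoff


# Hypothetical records: (transcript id, fold change, P value).
records = [
    ("transcript_1", 2.0, 0.01),   # up, significant
    ("transcript_2", 0.5, 0.03),   # down, significant
    ("transcript_3", 1.2, 0.001),  # significant but change too small
    ("transcript_4", 3.0, 0.20),   # large change but not significant
]

selected = [name for name, fc, p in records
            if is_differentially_expressed(fc, p)]
print(selected)
```

    Requiring both conditions keeps transcripts whose change is large enough to matter and unlikely to be noise; either condition alone would admit the last two hypothetical records.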

  14. Investigating the Mechanisms of Action of Depside Salt from Salvia miltiorrhiza Using Bioinformatic Analysis

    Directory of Open Access Journals (Sweden)

    Hua Li

    2017-01-01

    Full Text Available Salvia miltiorrhiza is a traditional Chinese medicinal herb used for treating cardiovascular diseases. Depside salt from S. miltiorrhiza (DSSM) contains the following active components: magnesium lithospermate B, lithospermic acid, and rosmarinic acid. This study aimed to reveal the mechanisms of action of DSSM. DSSM-associated genes were retrieved from GeneCards, Search Tool for Interacting Chemicals, SuperTarget, PubChem, and Comparative Toxicogenomics Database and then subjected to enrichment analysis using Multifaceted Analysis Tool for Human Transcriptome. A protein-protein interaction (PPI) network was visualised; module analysis was conducted using the Cytoscape software. Finally, a transcriptional regulatory network was constructed using the TRRUST database and Cytoscape. Seventy-three DSSM-associated genes were identified. JUN, TNF, NFKB1, and FOS were hub nodes in the PPI network. Modules 1 and 2 were identified from the PPI network, with pathway enrichment analysis showing that the presence of NFKB1 and BCL2 in module 1 was indicative of a particular association with the NF-κB signalling pathway. JUN, TNF, NFKB1, FOS, and BCL2 exhibited notable interactions among themselves in the PPI network. Several regulatory relationships (such as JUN → TNF/FOS, FOS → NFKB1 and NFKB1 → BCL2/TNF) were also found in the regulatory network. Thus, DSSM exerts effects against cardiovascular diseases by targeting JUN, TNF, NFKB1, FOS, and BCL2.

  15. Cloning and bioinformatics analysis of an ubiquitin gene of the rice ...

    African Journals Online (AJOL)

    Jane

    2011-10-05

    Oct 5, 2011 ... 2Institute of Plant Protection, Anhui Academy of Agricultural ... localization analysis showed that CsUB protein of cytoplasm, cell ... and plasma membrane occupied about 47.80, 26.10, 17.40, 4.30 and 4.30%, respectively.

  16. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio Hernández

    2015-06-01

    Full Text Available Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations (it requires the existence of multialigned family protein sequences) but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.

  17. Closha: bioinformatics workflow system for the analysis of massive sequencing data.

    Science.gov (United States)

    Ko, GunHwan; Kim, Pan-Gyu; Yoon, Jongcheol; Han, Gukhee; Park, Seong-Jin; Song, Wangho; Lee, Byungwook

    2018-02-19

    While next-generation sequencing (NGS) costs have fallen in recent years, the cost and complexity of computation remain substantial obstacles to the use of NGS in bio-medical care and genomic research. The rapidly increasing amounts of data available from the new high-throughput methods have made data processing infeasible without automated pipelines. The integration of data and analytic resources into workflow systems provides a solution to the problem by simplifying the task of data analysis. To address this challenge, we developed a cloud-based workflow management system, Closha, to provide fast and cost-effective analysis of massive genomic data. We implemented complex workflows making optimal use of high-performance computing clusters. Closha allows users to create multi-step analyses using drag and drop functionality and to modify the parameters of pipeline tools. Users can also import the Galaxy pipelines into Closha. Closha is a hybrid system that enables users to use both analysis programs providing traditional tools and MapReduce-based big data analysis programs simultaneously in a single pipeline. Thus, the execution of analytics algorithms can be parallelized, speeding up the whole process. We also developed a high-speed data transmission solution, KoDS, to transmit a large amount of data at a fast rate. KoDS has a file transfer speed of up to 10 times that of normal FTP and HTTP. The computer hardware for Closha is 660 CPU cores and 800 TB of disk storage, enabling 500 jobs to run at the same time. Closha is a scalable, cost-effective, and publicly available web service for large-scale genomic data analysis. Closha supports the reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Closha provides a user-friendly interface to all genomic scientists to try to derive accurate results from NGS platform data. The Closha cloud server is freely available for use from http://closha.kobic.re.kr/ .

  18. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  19. Identification of Key Pathways and Genes in Advanced Coronary Atherosclerosis Using Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Xiaowen Tan

    2017-01-01

    Full Text Available Background. Coronary artery atherosclerosis is a chronic inflammatory disease. This study aimed to identify the key changes of gene expression between early and advanced carotid atherosclerotic plaque in humans. Methods. Gene expression dataset GSE28829 was downloaded from Gene Expression Omnibus (GEO), including 16 advanced and 13 early stage atherosclerotic plaque samples from human carotid. Differentially expressed genes (DEGs) were analyzed. Results. 42,450 genes were obtained from the dataset. Top 100 up- and downregulated DEGs were listed. Functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) identification were performed. The result of functional and pathway enrichment analysis indicated that the immune system process played a critical role in the progression of carotid atherosclerotic plaque. Protein-protein interaction (PPI) networks were also constructed. Top 10 hub genes were identified from the PPI network and top 6 modules were inferred. These genes were mainly involved in chemokine signaling pathway, cell cycle, B cell receptor signaling pathway, focal adhesion, and regulation of actin cytoskeleton. Conclusion. The present study indicated that analysis of DEGs could deepen understanding of the molecular mechanisms of atherosclerosis development and that these genes might be used as molecular targets and diagnostic biomarkers for the treatment of atherosclerosis.
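
    Ranking hub genes in a PPI network is often done by node degree (the number of distinct interaction partners). The abstract does not state which centrality measure was used, so the sketch below assumes degree ranking; the function name and the protein labels are hypothetical.

```python
from collections import Counter


def top_hubs(edges, n=10):
    """Rank nodes of an undirected network by degree.

    edges is an iterable of (node_a, node_b) pairs. Self-loops and
    duplicate edges (in either orientation) are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue
        key = frozenset((a, b))
        if key in seen:
            continue
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical interaction list; ("P3", "P1") duplicates ("P1", "P3").
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P2", "P3"), ("P3", "P1"), ("P4", "P5")]

print(top_hubs(edges, n=2))
```

    Other centrality measures (betweenness, closeness) can rank the same network differently, so the choice of measure should be reported alongside any hub list.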

  20. MiR-139 in digestive system tumor diagnosis and detection: Bioinformatics and meta-analysis.

    Science.gov (United States)

    Wang, Yu-Hui; Ji, Jia; Weng, Hong; Wang, Bi-Cheng; Wang, Fu-Bing

    2018-06-05

    Accumulating evidence has indicated that microRNAs play important roles in the initiation and progression of digestive system tumors. However, previous studies suggest that the accuracy of miRNA detection in digestive system tumors was inconsistent. The candidate miRNAs were obtained from The Cancer Genome Atlas (TCGA). Meta-analysis was performed to evaluate the diagnostic value of these miRNAs in digestive system tumors. Furthermore, the potential target genes of the miRNAs were predicted and assessed with functional analysis. According to TCGA data, miR-139 was a common biomarker of digestive system tumors. It was markedly reduced in tumor tissues as compared with non-cancerous tissues in digestive system tumors. In the meta-analysis, the pooled diagnostic odds ratio (DOR) and AUC were 57.51 (95% CI: 14.25-232.04) and 0.96 (95% CI: 0.94-0.97), respectively. Furthermore, the overall sensitivity and specificity were 0.89 (95% CI: 0.73-0.96) and 0.91 (95% CI: 0.75-0.97), respectively. The diagnostic value of tissue miR-139 was higher than the diagnostic value of blood miR-139. In particular, miR-139 was a superior marker for distinguishing colorectal cancer. miR-139 could be a potential biomarker for diagnosis of digestive system tumors, especially colorectal cancer. Copyright © 2017. Published by Elsevier B.V.

  1. Strategic Integration of Multiple Bioinformatics Resources for System Level Analysis of Biological Networks.

    Science.gov (United States)

    D'Souza, Mark; Sulakhe, Dinanath; Wang, Sheng; Xie, Bing; Hashemifar, Somaye; Taylor, Andrew; Dubchak, Inna; Conrad Gilliam, T; Maltsev, Natalia

    2017-01-01

    Recent technological advances in genomics allow the production of biological data at unprecedented tera- and petabyte scales. Efficient mining of these vast and complex datasets for the needs of biomedical research critically depends on a seamless integration of the clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships. Such experimental data accumulated in publicly available databases should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. We present an integrated computational platform Lynx (Sulakhe et al., Nucleic Acids Res 44:D882-D887, 2016) ( http://lynx.cri.uchicago.edu ), a web-based database and knowledge extraction engine. It provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization. It gives public access to the Lynx integrated knowledge base (LynxKB) and its analytical tools via user-friendly web services and interfaces. The Lynx service-oriented architecture supports annotation and analysis of high-throughput experimental data. Lynx tools assist the user in extracting meaningful knowledge from LynxKB and experimental data, and in the generation of weighted hypotheses regarding the genes and molecular mechanisms contributing to human phenotypes or conditions of interest. The goal of this integrated platform is to support the end-to-end analytical needs of various translational projects.

  2. Integrated bioinformatics analysis reveals key candidate genes and pathways in breast cancer.

    Science.gov (United States)

    Wang, Yuzhi; Zhang, Yi; Huang, Qian; Li, Chengwen

    2018-04-19

    Breast cancer (BC) is the leading malignancy in women worldwide, yet relatively little is known about the genes and signaling pathways involved in BC tumorigenesis and progression. The present study aimed to elucidate potential key candidate genes and pathways in BC. Five gene expression profile data sets (GSE22035, GSE3744, GSE5764, GSE21422 and GSE26910) were downloaded from the Gene Expression Omnibus (GEO) database, which included data from 113 tumorous and 38 adjacent non‑tumorous tissue samples. Differentially expressed genes (DEGs) were identified using t‑tests in the limma R package. These DEGs were subsequently investigated by pathway enrichment analysis and a protein‑protein interaction (PPI) network was constructed. The most significant module from the PPI network was selected for pathway enrichment analysis. In total, 227 DEGs were identified, of which 82 were upregulated and 145 were downregulated. Pathway enrichment analysis results revealed that the upregulated DEGs were mainly enriched in 'cell division', the 'proteinaceous extracellular matrix (ECM)', 'ECM structural constituents' and 'ECM‑receptor interaction', whereas downregulated genes were mainly enriched in 'response to drugs', 'extracellular space', 'transcriptional activator activity' and the 'peroxisome proliferator‑activated receptor signaling pathway'. The PPI network contained 174 nodes and 1,257 edges. DNA topoisomerase 2‑a, baculoviral inhibitor of apoptosis repeat‑containing protein 5, cyclin‑dependent kinase 1, G2/mitotic‑specific cyclin‑B1 and kinetochore protein NDC80 homolog were identified as the top 5 hub genes. Furthermore, the genes in the most significant module were predominantly involved in 'mitotic nuclear division', 'mid‑body', 'protein binding' and 'cell cycle'. In conclusion, the DEGs, relative pathways and hub genes identified in the present study may aid in understanding of the molecular mechanisms underlying BC progression and provide

  3. Molecular structure, bioinformatics analysis, expression and bioactivity of BAFF (TNF13B) in dog (Canis familiaris).

    Science.gov (United States)

    Shen, Yuefen; Wang, Shule; Ai, Hongxin; Min, Cui; Chen, Yuqing; Zhang, Shuangquan

    2011-07-15

    B cell activating factor (BAFF), belonging to the tumor necrosis factor (TNF) family, plays an important role in B cell survival, proliferation and differentiation. In the present study, we amplified the cDNA of dog (Canis familiaris) BAFF (designated doBAFF) from spleen by reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE) strategies. The open reading frame (ORF) of doBAFF covers 879 bp encoding 292 amino acids, with a 152-aa mature peptide. The soluble mature part of doBAFF (dosBAFF) shares 91.5%, 72.1%, 96.7%, 94.0% and 91.5% identity with the human, mouse, cattle, pig and rabbit counterparts, respectively. The predicted three-dimensional (3D) structural analysis of dosBAFF analyzed by "comparative protein modeling" revealed that it was very similar to its human counterpart. Real-time quantitative PCR analysis revealed that doBAFF was mainly expressed in spleen. Two fusion proteins SUMO-dosBAFF and GFP/dosBAFF were efficiently expressed in Escherichia coli BL21 (DE3) and purified using metal chelate affinity chromatography (Ni-NTA). Laser scanning confocal microscopy analysis showed that GFP/dosBAFF could bind to the mouse splenic B-cell. In vitro, SUMO-dosBAFF and GFP/dosBAFF were able to promote the survival/proliferation of dog lymphocytes or mouse splenic B cells with/without anti-IgM. Therefore, BAFF may be a potential immunologic factor for enhancing immunological efficacy in dog. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. [Cloning and bioinformatics analysis of abscisic acid 8'-hydroxylase from Pseudostellariae Radix].

    Science.gov (United States)

    Li, Jun; Long, Deng-Kai; Zhou, Tao; Ding, Ling; Zheng, Wei; Jiang, Wei-Ke

    2016-07-01

    Abscisic acid 8'-hydroxylase is one of the key enzymes in the metabolism of abscisic acid (ABA). Seven members of the abscisic acid 8'-hydroxylase family were identified from Pseudostellaria heterophylla transcriptome sequencing results by sequence homology. The expression profiles of these genes were analyzed using transcriptome data. The coding sequence of ABA8ox1 was cloned and analyzed with bioinformatics tools. The full-length cDNA of ABA8ox1 was 1 401 bp, with 480 encoded amino acids. The predicted isoelectric point (pI) and relative molecular mass (MW) were 8.55 and 53 kDa, respectively. Transmembrane structure analysis showed that there were 21 amino acids inside and 445 amino acids outside. High transcript levels were detected in root bark and fibrous roots. Multiple alignment and phylogenetic analysis both showed that ABA8ox1 had a high similarity with the CYP707As from other plants, especially with AtCYP707A1 and AtCYP707A3 in Arabidopsis thaliana. These results lay a foundation for studying the molecular mechanism of tuberous root expansion and the response to adversity stress. Copyright© by the Chinese Pharmaceutical Association.

  5. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    Directory of Open Access Journals (Sweden)

    Jinlan Wang

    2016-05-01

    Full Text Available Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helixes. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function, but these require further experimental validation.

  6. [Genome-wide identification and bioinformatic analysis of PPR gene family in tomato].

    Science.gov (United States)

    Ding, Anming; Li, Ling; Qu, Xu; Sun, Tingting; Chen, Yaqiong; Zong, Peng; Li, Zunqiang; Gong, Daping; Sun, Yuhe

    2014-01-01

    Pentatricopeptide repeat (PPR) genes constitute one of the largest gene families in plants and play a broad and essential role in plant growth and development. In this study, the protein sequences annotated by the tomato (S. lycopersicum L.) genome project were screened with the Pfam PPR sequences. A total of 471 putative PPR-encoding genes were identified. Based on the motifs defined in A. thaliana L., protein structure and conserved sequences for each tomato motif were analyzed. We also analyzed the phylogenetic relationships, subcellular localization, expression and GO annotation of the identified gene sequences. Our results demonstrate that the tomato PPR gene family contains two subfamilies, P and PLS, each accounting for half of the family. The PLS subfamily can be divided into four subclasses, i.e., PLS, E, E+ and DYW. Each subclass of sequences forms a clade in the phylogenetic tree. The PPR motifs were found to be highly conserved among plants. The tomato PPR genes were distributed over 12 chromosomes and most of them lack introns. The majority of PPR proteins harbor mitochondrial or chloroplast localization sequences, whereas GO analysis showed that most PPR proteins participate in RNA-related biological processes.

  7. Bioinformatic analysis of microRNA biogenesis and function related proteins in eleven animal genomes.

    Science.gov (United States)

    Liu, Xiuying; Luo, GuanZheng; Bai, Xiujuan; Wang, Xiu-Jie

    2009-10-01

    MicroRNAs are approximately 22 nt long small non-coding RNAs that play important regulatory roles in eukaryotes. The biogenesis and functional processes of microRNAs require the participation of many proteins, of which the best studied are Dicer, Drosha, Argonaute and Exportin 5. To systematically study these four protein families, we screened 11 animal genomes for genes encoding the above-mentioned proteins and identified some new members for each family. Domain analysis results revealed that most proteins within the same family share identical or similar domains. Alternatively spliced transcript variants were found for some proteins. We also examined the expression patterns of these proteins in different human tissues and identified other proteins that could potentially interact with these proteins. These findings provide systematic information on the four key proteins involved in microRNA biogenesis and functional pathways in animals, and will shed light on further functional studies of these proteins.

  8. Bioinformatics analysis on molecular mechanism of rheum officinale in treatment of jaundice

    Science.gov (United States)

    Shan, Si; Tu, Jun; Nie, Peng; Yan, Xiaojun

    2017-01-01

    Objective: To study the molecular mechanism of Rheum officinale in the treatment of Jaundice by building molecular networks and comparing canonical pathways. Methods: Target proteins of Rheum officinale and related genes of Jaundice were searched from the PubChem and Gene databases online, respectively. Molecular network and canonical pathway comparison analyses were performed by Ingenuity Pathway Analysis (IPA). Results: The molecular networks of Rheum officinale and Jaundice were complex and multifunctional. Forty target proteins of Rheum officinale and 33 Homo sapiens genes of Jaundice were found in the databases. There were 19 pathways common to both networks. Rheum officinale could regulate endothelial differentiation, Interleukin-1B (IL-1B) and Tumor Necrosis Factor (TNF) in these pathways. Conclusions: Rheum officinale treats Jaundice by regulating many effective nodes of the apoptotic pathway and cellular immunity related pathways.

  9. Bioinformatic analysis of cis-regulatory interactions between progesterone and estrogen receptors in breast cancer

    Directory of Open Access Journals (Sweden)

    Matloob Khushi

    2014-11-01

    Full Text Available Chromatin factors interact with each other in a cell- and sequence-specific manner in order to regulate transcription, and a wealth of publicly available datasets exists describing the genomic locations of these interactions. Our recently published BiSA (Binding Sites Analyser) database contains transcription factor binding locations and epigenetic modifications collected from published studies and provides tools to analyse stored and imported data. Using BiSA we investigated the overlapping cis-regulatory role of estrogen receptor alpha (ERα) and progesterone receptor (PR) in the T-47D breast cancer cell line. We found that ERα binding sites overlap with a subset of PR binding sites. To investigate further, we re-analysed raw data to remove any biases introduced by the use of distinct tools in the original publications. We identified 22,152 PR and 18,560 ERα binding sites (<5% false discovery rate) with 4,358 overlapping regions among the two datasets. BiSA statistical analysis revealed a non-significant overall overlap correlation between the two factors, suggesting that ERα and PR are not partner factors and do not require each other for binding to occur. However, Monte Carlo simulation by Binary Interval Search (BITS) and the Relevant Distance, Absolute Distance, Jaccard and Projection tests by Genometricorr revealed a statistically significant spatial correlation of binding regions on chromosomes between the two factors. Motif analysis revealed that the shared binding regions were enriched with binding motifs for ERα, PR and a number of other transcription and pioneer factors. Some of these factors are known to co-locate with ERα and PR binding. Therefore the close spatial proximity of ERα binding sites to PR binding sites suggests that ERα and PR in general function independently at the molecular level, but that their activities converge on a specific subset of transcriptional targets.
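
    Two of the quantities mentioned in this abstract, the number of overlapping binding regions and a Jaccard measure between two sets of genomic intervals, can be sketched for a single chromosome as below. This is an illustrative assumption, not the BiSA, BITS or Genometricorr implementation: the function names and coordinates are hypothetical, intervals are treated as half-open (start, end) pairs, and no significance testing is performed.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def covered_bp(intervals):
    """Total number of base pairs covered by a set of intervals."""
    return sum(end - start for start, end in merge(intervals))


def intersection_bp(a, b):
    """Base pairs covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a, b):
    """Shared base pairs divided by base pairs covered by either set."""
    inter = intersection_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


def count_overlapping(a, b):
    """Number of intervals in a that overlap at least one interval in b."""
    return sum(any(s < e2 and s2 < e for s2, e2 in b) for s, e in a)


# Hypothetical binding sites on one chromosome.
era_sites = [(100, 200), (500, 600), (900, 1000)]
pr_sites = [(150, 250), (580, 700), (2000, 2100)]

print(count_overlapping(era_sites, pr_sites))
print(intersection_bp(era_sites, pr_sites))
print(jaccard(era_sites, pr_sites))
```

    A raw overlap count or Jaccard value says nothing about significance on its own; the study compares such statistics against a null distribution, which this sketch omits.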

  10. Bioinformatic analysis of xenobiotic reactive metabolite target proteins and their interacting partners

    Directory of Open Access Journals (Sweden)

    Hanzlik Robert P

    2009-06-01

    Full Text Available Background: Protein covalent binding by reactive metabolites of drugs, chemicals and natural products can lead to acute cytotoxicity. Recent rapid progress in reactive metabolite target protein identification has shown that adduction is surprisingly selective and inspired the hope that analysis of target proteins might reveal protein factors that differentiate target vs. non-target proteins and illuminate mechanisms connecting covalent binding to cytotoxicity. Results: Sorting 171 known reactive metabolite target proteins revealed a number of GO categories and KEGG pathways to be significantly enriched in targets, but in most cases the classes were too large, and the "percent coverage" too small, to allow meaningful conclusions about mechanisms of toxicity. However, a similar analysis of the directly interacting partners of 28 common targets of multiple reactive metabolites revealed highly significant enrichments in terms likely to be highly relevant to cytotoxicity (e.g., MAP kinase pathways, apoptosis, response to unfolded protein). Machine learning was used to rank the contribution of 211 computed protein features to determining protein susceptibility to adduction. Protein lysine (but not cysteine) content and protein instability index (i.e., rate of turnover in vivo) were among the features most important to determining susceptibility. Conclusion: As yet there is no good explanation for why some low-abundance proteins become heavily adducted while some abundant proteins become only lightly adducted in vivo. Analyzing the directly interacting partners of target proteins appears to yield greater insight into mechanisms of toxicity than analyzing target proteins per se. The insights provided can readily be formulated as hypotheses to test in future experimental studies.

  11. Carbohydrate-active enzymes in Trichoderma harzianum: a bioinformatic analysis bioprospecting for key enzymes for the biofuels industry.

    Science.gov (United States)

    Ferreira Filho, Jaire Alves; Horta, Maria Augusta Crivelente; Beloti, Lilian Luzia; Dos Santos, Clelton Aparecido; de Souza, Anete Pereira

    2017-10-12

    Trichoderma harzianum is used in biotechnology applications due to its ability to produce powerful enzymes for the conversion of lignocellulosic substrates into soluble sugars. Active enzymes involved in carbohydrate metabolism are defined as carbohydrate-active enzymes (CAZymes), and the most abundant family in the CAZy database is the glycoside hydrolases. The enzymes of this family play a fundamental role in the decomposition of plant biomass. In this study, the CAZymes of T. harzianum were identified and classified using bioinformatic approaches after which the expression profiles of all annotated CAZymes were assessed via RNA-Seq, and a phylogenetic analysis was performed. A total of 430 CAZymes (3.7% of the total proteins for this organism) were annotated in T. harzianum, including 259 glycoside hydrolases (GHs), 101 glycosyl transferases (GTs), 6 polysaccharide lyases (PLs), 22 carbohydrate esterases (CEs), 42 auxiliary activities (AAs) and 46 carbohydrate-binding modules (CBMs). Among the identified T. harzianum CAZymes, 47% were predicted to harbor a signal peptide sequence and were therefore classified as secreted proteins. The GH families were the CAZyme class with the greatest number of expressed genes, including GH18 (23 genes), GH3 (17 genes), GH16 (16 genes), GH2 (13 genes) and GH5 (12 genes). A phylogenetic analysis of the proteins in the AA9/GH61, CE5 and GH55 families showed high functional variation among the proteins. Identifying the main proteins used by T. harzianum for biomass degradation can ensure new advances in the biofuel production field. Herein, we annotated and characterized the expression levels of all of the CAZymes from T. harzianum, which may contribute to future studies focusing on the functional and structural characterization of the identified proteins.

  12. General Approach to Identifying Potential Targets for Cancer Imaging by Integrated Bioinformatics Analysis of Publicly Available Genomic Profiles

    Directory of Open Access Journals (Sweden)

    Yongliang Yang

    2011-03-01

    Full Text Available Molecular imaging has moved to the forefront of drug development and biomedical research. The identification of appropriate imaging targets has become the touchstone for the accurate diagnosis and prognosis of human cancer. In particular, cell surface- or membrane-bound proteins are attractive imaging targets because of their aberrant expression, easily accessible location, and unique biochemical functions in tumor cells. Previously, we published a literature mining of potential targets for our in-house enzyme-mediated cancer imaging and therapy technology. Here we present a simple and integrated bioinformatics analysis approach that assembles a public cancer microarray database with a pathway knowledge base for ascertaining and prioritizing upregulated genes encoding cell surface- or membrane-bound proteins, which could serve as imaging targets. As examples, we obtained lists of potential hits for six common and lethal human tumors in the prostate, breast, lung, colon, ovary, and pancreas. As control tests, a number of well-known cancer imaging targets were detected and confirmed by our study. Further, by consulting gene-disease and protein-disease databases, we suggest a number of significantly upregulated genes as promising imaging targets, including cell surface-associated mucin-1, prostate-specific membrane antigen, hepsin, urokinase plasminogen activator receptor, and folate receptors. By integrating pathway analysis, we are able to organize and map “focused” interaction networks derived from significantly dysregulated entity pairs to reflect important cellular functions in disease processes. We provide herein an example of identifying a tumor cell growth and proliferation subnetwork for prostate cancer. This systematic mining approach can be broadly applied to identify imaging or therapeutic targets for other human diseases.

  13. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety. 2015 FRAME.

  14. Bioinformatics Identification of Modules of Transcription Factor Binding Sites in Alzheimer's Disease-Related Genes by In Silico Promoter Analysis and Microarrays

    Directory of Open Access Journals (Sweden)

    Regina Augustin

    2011-01-01

    Full Text Available The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD) pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases.

  15. Comprehensive data analysis of human ureter proteome

    Directory of Open Access Journals (Sweden)

    Sameh Magdeldin

    2016-03-01

    Full Text Available A comprehensive human ureter proteome dataset was generated from OFFGel fractionated ureter samples. Our results showed that among 2217 non-redundant ureter proteins, 751 protein candidates (33.8%) were detected in urine as urinary protein/polypeptide or exosomal protein. On the other hand, comparing ureter protein hits (48) that are not shown in corresponding databases to urinary bladder and prostate human protein atlas databases pinpointed 21 proteins that might be unique to ureter tissue. In conclusion, this finding offers future perspectives for possible identification of ureter disease-associated biomarkers such as ureter carcinoma. In addition, Cytoscape GO annotation was examined on the final ureter dataset to better understand the proteins' molecular functions, biological processes, and cellular components. The ureter proteomic dataset published in this article will provide a valuable resource for researchers working in the field of urology and urine biomarker discovery.

  16. Integrated bioinformatic analysis unveils significant genes and pathways in the pathogenesis of supratentorial primitive neuroectodermal tumor

    Directory of Open Access Journals (Sweden)

    Wang G

    2018-04-01

    Full Text Available Guang-Yu Wang,1,* Ling Li,2,* Bo Liu,1 Xiao Han,1 Chun-Hua Wang,1 Ji-Wen Wang3 1Department of Neurosurgery, 2Department of Pediatrics, Qilu Children’s Hospital of Shandong University, Jinan, Shandong, 3Department of Neurology, Shanghai Children’s Medical Center, Shanghai Jiaotong University School of Medicine, Pudong New District, Shanghai, People’s Republic of China *These authors contributed equally to this work Purpose: This study aimed to explore significant genes and pathways involved in the pathogenesis of supratentorial primitive neuroectodermal tumor (sPNET). Materials and methods: Gene expression profile of GSE14295 was downloaded from the publicly available Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened out in primary sPNET samples compared with normal fetal and adult brain reference samples (sPNET vs fetal brain and sPNET vs adult brain). Pathway enrichment analysis of these DEGs was conducted, followed by protein–protein interaction (PPI) network construction and significant module selection. Additionally, transcription factors (TFs) regulating the common DEGs in the two comparison groups were identified, and the regulatory network was constructed. Results: In total, 526 DEGs (99 up- and 427 downregulated) in sPNET vs fetal brain and 815 DEGs (200 up- and 615 downregulated) in sPNET vs adult brain were identified. DEGs in sPNET vs fetal brain and sPNET vs adult brain were associated with calcium signaling pathway, cell cycle, and p53 signaling pathway. CDK1, CDC20, BUB1B, and BUB1 were hub nodes in the PPI networks of DEGs in sPNET vs fetal brain and sPNET vs adult brain. Significant modules were extracted from the PPI networks. In addition, 64 upregulated and 200 downregulated overlapping DEGs were identified in both sPNET vs fetal brain and sPNET vs adult brain. The genes involved in the regulatory network upon overlapping DEGs and the TFs were correlated with calcium signaling pathway
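
    The overlap step described in this record (DEGs shared by the two comparisons, split by direction) can be sketched with plain set operations. This is a minimal illustration only: the gene names below are placeholders, not genes from GSE14295, and the function name `overlap_degs` is hypothetical.

```python
def overlap_degs(vs_fetal, vs_adult):
    """Return genes regulated in the same direction in both comparisons.

    Each argument is a dict with "up" and "down" sets of gene symbols.
    """
    return {
        "up": vs_fetal["up"] & vs_adult["up"],
        "down": vs_fetal["down"] & vs_adult["down"],
    }


# Placeholder gene symbols (assumed for illustration, not real results).
vs_fetal = {"up": {"GENE_A", "GENE_B", "GENE_C"}, "down": {"GENE_X", "GENE_Y"}}
vs_adult = {"up": {"GENE_A", "GENE_B", "GENE_D"}, "down": {"GENE_Y", "GENE_Z"}}

common = overlap_degs(vs_fetal, vs_adult)
print(sorted(common["up"]), sorted(common["down"]))
```

    Keeping the up- and downregulated sets separate avoids counting a gene as shared when it moves in opposite directions in the two comparisons.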

  17. Flux Analysis of the Trypanosoma brucei Glycolysis Based on a Multiobjective-Criteria Bioinformatic Approach

    Directory of Open Access Journals (Sweden)

    Amine Ghozlane

    2012-01-01

    Full Text Available Trypanosoma brucei is a protozoan parasite of major interest in discovering new genes for drug targets. This parasite alternates its life cycle between the mammal host(s) (bloodstream form) and the insect vector (procyclic form), with two divergent glucose metabolisms amenable to in vitro culture. While the metabolic network of the bloodstream forms has been well characterized, the flux distribution between the different branches of the glucose metabolic network in the procyclic form has not been addressed so far. We present a computational analysis (called Metaboflux) that exploits the metabolic topology of the procyclic form, and allows the incorporation of multipurpose experimental data to increase the biological relevance of the model. The alternatives resulting from the structural complexity of networks are formulated as an optimization problem solved by a metaheuristic where experimental data are modeled in a multiobjective function. Our results show that the current metabolic model is in agreement with experimental data and confirms the observed high metabolic flexibility of glucose metabolism. In addition, Metaboflux offers a rational explanation for the high flexibility in the ratio between final products from glucose metabolism, that is, flux redistribution through the malic enzyme steps.

  18. Cloning and bioinformatics analysis of CcPILS gene of Hickory (Carya cathayensis)

    Science.gov (United States)

    Guo, Wenbin; Yuan, Huwei; Gao, Liuxiao; Guo, Haipeng; Qiu, Lingling; Xu, Dongbin; Yan, Daoliang; Zheng, Bingsong

    2017-04-01

    PILS is a key auxin efflux carrier protein in auxin signal transduction. A CcPILS gene related to the hickory (Carya cathayensis) grafting process was obtained by RACE techniques. The full length of the CcPILS gene was 1541 bp and contained a 1263 bp open reading frame (ORF). CcPILS encoded 294 amino acids with a molecular weight of 46 kDa and a pI of 5.38, and was localized at the endoplasmic reticulum membrane. The gene contained a central hydrophilic loop separating two hydrophobic domains of about five transmembrane regions each. CcPILS belonged to the Clade III sub-family of PILS and its sequence had high homology with Arabidopsis. Real Time RT-PCR analysis showed that gene expression was weakly induced in bud, inflorescence, fruit, leaf and stem, but strongly induced in root. The expression levels were strongly induced and reached a peak at the third day of grafting in scion and rootstock of hickory, which were 1.45 and 3.45 times higher, respectively, compared to that of control. The results indicated that CcPILS may be involved in regulating the expression of genes related to auxin signal transduction during the hickory grafting process.

  19. A Bioinformatic Strategy for the Detection, Classification and Analysis of Bacterial Autotransporters

    Science.gov (United States)

    Celik, Nermin; Webb, Chaille T.; Leyton, Denisse L.; Holt, Kathryn E.; Heinz, Eva; Gorrell, Rebecca; Kwok, Terry; Naderer, Thomas; Strugnell, Richard A.; Speed, Terence P.; Teasdale, Rohan D.; Likić, Vladimir A.; Lithgow, Trevor

    2012-01-01

    Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel-domain, a characteristic feature of autotransporters, was found to be composed from seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel-domain, indicating it as an important feature in the assembly of autotransporters. PMID:22905239

  20. Genomic Analysis of a Marine Bacterium: Bioinformatics for Comparison, Evaluation, and Interpretation of DNA Sequences

    Directory of Open Access Journals (Sweden)

    Bhagwan N. Rekadwad

    2016-01-01

    Full Text Available A total of five highly related strains of an unidentified marine bacterium were analyzed through their short genome sequences (AM260709–AM260713). Genome-to-Genome Distance (GGDC) showed high similarity to Pseudoalteromonas haloplanktis (X67024). The generated unique Quick Response (QR) codes indicated no identity to other microbial species or gene sequences. Chaos Game Representation (CGR) showed the number of bases concentrated in the area. Guanine residues were highest in number followed by cytosine. Frequency of Chaos Game Representation (FCGR) indicated that CC and GG blocks have higher frequency in the sequence from the evaluated marine bacterium strains. Maximum GC content for the marine bacterium strains ranged from 53% to 54%. The use of QR codes, CGR, FCGR, and GC dataset helped in identifying and interpreting short genome sequences from specific isolates. A phylogenetic tree was constructed with the bootstrap test (1000 replicates) using MEGA6 software. Principal Component Analysis (PCA) was carried out using the EMBL-EBI MUSCLE program. Thus, the generated genomic data are of great assistance for hierarchical classification in Bacterial Systematics which, combined with phenotypic features, represents a basic procedure for a polyphasic approach on unambiguous bacterial isolate taxonomic classification.
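
    The sequence statistics named in this record (GC content, dinucleotide block frequencies as used in FCGR, and CGR coordinates) can be sketched as follows. This is a minimal illustration under stated assumptions: the toy sequence is made up, the function names are hypothetical, and the corner layout for the four bases is one common convention rather than the one the authors necessarily used.

```python
from collections import Counter


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def kmer_frequencies(seq, k=2):
    """Relative frequency of each overlapping k-mer (k=2 gives blocks such as CC, GG)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Assumed corner layout for the unit square (one common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game: start at the centre and move halfway to each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


toy = "GGCCGGATCC"  # made-up sequence, not one of the AM260709–AM260713 records
print(gc_content(toy), kmer_frequencies(toy)["GG"], cgr_points(toy)[0])
```

    Binning the CGR points on a 2^k by 2^k grid yields the k-mer counts, which is why FCGR and k-mer frequency tables carry the same information.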

  1. The Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns

    Directory of Open Access Journals (Sweden)

    Yazhong Jin

    2016-05-01

    Full Text Available Alcohol dehydrogenases (ADH), encoded by a multigene family in plants, play a critical role in plant growth, development, adaptation, fruit ripening and aroma production. Thirteen ADH genes were identified in the melon genome, including 12 ADHs and one formaldehyde dehydrogenase (FDH), designated CmADH1-12 and CmFDH1, of which CmADH1 and CmADH2 have been isolated in Cantaloupe. ADH genes shared a lower identity with each other at the protein level and had different intron-exon structures at the nucleotide level. No typical signal peptides were found in any of the CmADHs, and CmADH proteins might be located in the cytoplasm. The phylogenetic tree revealed that the 13 ADH genes were divided into 3 groups, namely the long-, medium- and short-chain ADH subfamilies, and CmADH1,3-11, which belong to the medium-chain ADH subfamily, fell into 6 medium-chain ADH subgroups. CmADH12 may belong to the long-chain ADH subfamily, while CmFDH1 may be a Class III ADH and serve as an ancestral ADH in melon. Expression profiling revealed that CmADH1, CmADH2, CmADH10 and CmFDH1 were moderately or strongly expressed in different vegetative tissues and fruit at medium and late developmental stages, while CmADH8 and CmADH12 were highly expressed in fruit after 20 days. CmADH3 showed preferential expression in young tissues. CmADH4 only had slight expression in root. Promoter analysis revealed several motifs of CmADH genes involved in the gene expression modulated by various hormones, and the response patterns of CmADH genes to ABA, IAA and ethylene were different. These CmADHs were divided into ethylene-sensitive and -insensitive groups, and the functions of CmADHs were discussed.

  2. Comprehensive adaptive mesh refinement in wrinkling prediction analysis

    NARCIS (Netherlands)

    Selman, A.; Meinders, Vincent T.; Huetink, Han; van den Boogaard, Antonius H.

    2002-01-01

    Discretisation errors indicator, contact free wrinkling and wrinkling with contact indicators are, in a challenging task, brought together and used in a comprehensive approach to wrinkling prediction analysis in thin sheet metal forming processes.

  3. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  4. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  5. Bioinformatics analysis of the factors controlling type I IFN gene expression in autoimmune disease and virus-induced immunity

    Directory of Open Access Journals (Sweden)

    Di Feng

    2013-09-01

    Full Text Available Patients with systemic lupus erythematosus (SLE) and Sjögren's syndrome (SS) display increased levels of type I IFN-induced genes. Plasmacytoid dendritic cells (PDCs) are natural interferon producing cells and considered to be a primary source of IFN-α in these two diseases. Differential expression patterns of type I IFN inducible transcripts can be found in different immune cell subsets and in patients with both active and inactive autoimmune disease. A type I IFN gene signature generally consists of three groups of IFN-induced genes: those regulated in response to virus-induced type I IFN, those regulated by the IFN-induced mitogen-activated protein kinase/extracellular-regulated kinase (MAPK/ERK) pathway, and those regulated by the IFN-induced phosphoinositide-3 kinase (PI-3K) pathway. These three groups of type I IFN-regulated genes control important cellular processes such as apoptosis, survival, adhesion, and chemotaxis that, when dysregulated, contribute to autoimmunity. With the recent generation of large datasets in the public domain from next-generation sequencing and DNA microarray experiments, one can perform detailed analyses of cell type-specific gene signatures as well as identify distinct transcription factors that differentially regulate these gene signatures. We have performed bioinformatics analysis of data in the public domain and experimental data from our lab to gain insight into the regulation of type I IFN gene expression. We have found that the genetic landscape of the IFNA and IFNB genes is occupied by transcription factors, such as insulators CTCF and cohesin, that negatively regulate transcription, as well as IRF5 and IRF7, that positively and distinctly regulate IFNA subtypes. A detailed understanding of the factors controlling type I IFN gene transcription will significantly aid in the identification and development of new therapeutic strategies targeting the IFN pathway in autoimmune disease.

  6. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  7. Comprehensive analysis of information dissemination in disasters

    Science.gov (United States)

    Zhang, N.; Huang, H.; Su, Boni

    2016-11-01

    China is a country that experiences a large number of disasters. The number of deaths caused by large-scale disasters and accidents in the past 10 years is around 900,000. More than 92.8 percent of these deaths could have been avoided if an effective pre-warning system had been deployed. Knowledge of the information dissemination characteristics of different information media, taking into consideration governmental assistance (information published by a government) in disasters in urban areas, plays a critical role in increasing response time and reducing the number of deaths and economic losses. In this paper we have developed a comprehensive information dissemination model to optimize the efficiency of pre-warning mechanisms. This model can also be used for disseminating information for evacuees making real-time evacuation plans. We analyzed each individual information dissemination model for pre-warning in disasters by considering 14 media: short message service (SMS), phone, television, radio, news portals, Wechat, microblogs, email, newspapers, loudspeaker vehicles, loudspeakers, oral communication, and passive information acquisition via visual and auditory senses. Since governmental assistance is very useful in a disaster, we calculated the sensitivity of the governmental assistance ratio. The results provide useful references for information dissemination during disasters in urban areas.

  8. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract: As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  9. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  10. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  11. Bioinformatics for Genome Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported essentially equal numbers of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, the project investigators were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. When all 15 possible rooted guide trees were supplied, they expected that at least one of them would be as good as CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold as they felt that their other proposed approaches would ultimately be better.
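
    The counts quoted in this record (three unrooted topologies and 15 rooted guide trees for four taxa) follow from the standard double-factorial formulas for binary trees on n labelled leaves: (2n-5)!! unrooted and (2n-3)!! rooted. The sketch below checks both numbers and enumerates the 15 rooted trees as nested tuples; the taxon labels A to D and all function names are placeholders chosen for illustration.

```python
def double_factorial(m):
    """Product m * (m - 2) * (m - 4) * ... down to 1 (returns 1 for m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_rooted(n):
    """Number of rooted binary trees on n labelled leaves."""
    return double_factorial(2 * n - 3)


def count_unrooted(n):
    """Number of unrooted binary trees on n labelled leaves (n >= 3)."""
    return double_factorial(2 * n - 5)


def insert_everywhere(tree, taxon):
    """Yield every tree made by attaching taxon to one branch of tree."""
    yield (tree, taxon)  # attach on the branch above this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, taxon):
            yield (new_left, right)
        for new_right in insert_everywhere(right, taxon):
            yield (left, new_right)


def rooted_trees(taxa):
    """Enumerate all rooted binary trees by adding one taxon at a time."""
    if len(taxa) == 1:
        return [taxa[0]]
    trees = []
    for smaller in rooted_trees(taxa[:-1]):
        trees.extend(insert_everywhere(smaller, taxa[-1]))
    return trees


guide_trees = rooted_trees(["A", "B", "C", "D"])  # placeholder taxon labels
print(count_unrooted(4), count_rooted(4), len(guide_trees))
```

    Each three-taxon rooted tree has five branches (four edges plus the root branch) where a fourth taxon can attach, which gives 3 × 5 = 15 guide trees.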

  12. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    With the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field that includes the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.

  13. Bioinformatics Approach Based Research of Profile Protein Carbonic Anhydrase II Analysis as a Potential Candidate Cause Autism for The Variation of Learning Subjects Biotechnology

    Directory of Open Access Journals (Sweden)

    Dian Eka A. F. Ningrum

    2017-03-01

    Full Text Available This study aims to determine the need for learning variations in Biotechnology courses using bioinformatics approaches. One example of the applied use of bioinformatics in a biotechnology course is the analysis of the protein profile of carbonic anhydrase II as a potential candidate cause of autism. This research is a qualitative descriptive study consisting of two phases. In the first phase, data were obtained from observations of learning, student questionnaires, and lecturer questionnaires. The first phase showed a need for learning variations in the Biotechnology course using bioinformatics. Data collection in the second phase used three web servers to predict the target protein, together with scientific articles. Proteins were visualized using PyMOL software. Based on the three web servers used, the candidate target protein associated with autism is carbonic anhydrase II. The survey results revealed that carbonic anhydrase II, a potential candidate cause of autism, is classified as a metalloenzyme that is able to bind heavy metals. The content of heavy metals in autistic patients is high, which affects metabolism. This prediction of a candidate protein causing autism is an applied use of bioinformatics to solve a problem in society, and so can help achieve the learning outcomes of the biotechnology course.

  14. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  15. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  16. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  17. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  18. Comprehensive physical analysis of bond wire interfaces in power modules

    DEFF Research Database (Denmark)

    Popok, Vladimir; Pedersen, Kristian Bonderup; Kristensen, Peter Kjær

    2016-01-01

    causing failures. In this paper we present a review on the set of our experimental and theoretical studies allowing comprehensive physical analysis of changes in materials under active power cycling with focus on bond wire interfaces and thin metallisation layers. The developed electro-thermal and thermo...

  19. Identification Of Protein Vaccine Candidates Using Comprehensive Proteomic Analysis Strategies

    Science.gov (United States)

    2007-12-01

    that fascinating fungus known as Coccidioides. I also want to thank the UA Mass Spectrometry Facility and the UA Proteomics Consortium, especially...W. & N. N. Kav. 2006. The proteome of the phytopathogenic fungus Sclerotinia sclerotiorum. Proteomics 6: 5995-6007. 127. de Godoy, L. M., J. V...IDENTIFICATION OF PROTEIN VACCINE CANDIDATES USING COMPREHENSIVE PROTEOMIC ANALYSIS STRATEGIES by James G. Rohrbough

  20. Comprehensive School Reform and Achievement: A Meta-Analysis

    Science.gov (United States)

    Borman, Geoffrey D.; Hewes, Gina M.; Overman, Laura T.; Brown, Shelly

    2003-01-01

    This meta-analysis reviews research on the achievement effects of comprehensive school reform (CSR) and summarizes the specific effects of 29 widely implemented models. There are limitations on the overall quantity and quality of the research base, but the overall effects of CSR appear promising. The combined quantity, quality, and statistical…

  1. Are There Gender Differences in Emotion Comprehension? Analysis of the Test of Emotion Comprehension.

    Science.gov (United States)

    Fidalgo, Angel M; Tenenbaum, Harriet R; Aznar, Ana

    2018-01-01

    This article examines whether there are gender differences in understanding the emotions evaluated by the Test of Emotion Comprehension (TEC). The TEC provides a global index of emotion comprehension in children 3-11 years of age, which is the sum of the nine components that constitute emotion comprehension: (1) recognition of facial expressions, (2) understanding of external causes of emotions, (3) understanding of desire-based emotions, (4) understanding of belief-based emotions, (5) understanding of the influence of a reminder on present emotional states, (6) understanding of the possibility to regulate emotional states, (7) understanding of the possibility of hiding emotional states, (8) understanding of mixed emotions, and (9) understanding of moral emotions. We used the answers to the TEC given by 172 English girls and 181 boys from 3 to 8 years of age. First, the nine components into which the TEC is subdivided were analysed for differential item functioning (DIF), taking gender as the grouping variable. To evaluate DIF, the Mantel-Haenszel method and logistic regression analysis were used applying the Educational Testing Service DIF classification criteria. The results show that the TEC did not display gender DIF. Second, when absence of DIF had been corroborated, it was analysed for differences between boys and girls in the total TEC score and its components controlling for age. Our data are compatible with the hypothesis of independence between gender and level of comprehension in 8 of the 9 components of the TEC. Several hypotheses are discussed that could explain the differences found between boys and girls in the belief component. Given that the Belief component is basically a false belief task, the differences found seem to support findings in the literature indicating that girls perform better on this task.
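
    The Mantel-Haenszel DIF check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the 2x2 counts are hypothetical, each stratum is assumed to be one matched ability level (for example a total-score band), and only the common odds ratio and its conversion to the ETS delta scale (the conventional factor of -2.35) are shown. The significance tests that the ETS classification also relies on are omitted.

```python
import math
from typing import Iterable, Tuple

# One stratum = one matched score level, given as
# (reference_correct, reference_incorrect, focal_correct, focal_incorrect).
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(strata: Iterable[Stratum]) -> float:
    """Common odds ratio pooled over all strata (Mantel-Haenszel estimator)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_d_dif(odds_ratio: float) -> float:
    """Convert the common odds ratio to the ETS delta metric (MH D-DIF)."""
    return -2.35 * math.log(odds_ratio)


# Hypothetical counts: both groups answer identically at each level, so no DIF.
no_dif = [(10, 10, 10, 10), (15, 5, 15, 5)]
alpha = mantel_haenszel_odds_ratio(no_dif)
print(alpha, mh_d_dif(alpha))
```

    An odds ratio near 1 (a delta value near 0) indicates that, at matched ability levels, the item behaves the same way in both groups.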

  2. The Use of Next Generation Sequencing and Junction Sequence Analysis Bioinformatics to Achieve Molecular Characterization of Crops Improved Through Modern Biotechnology

    Directory of Open Access Journals (Sweden)

    David Kovalic

    2012-11-01

    Full Text Available The assessment of genetically modified (GM) crops for regulatory approval currently requires a detailed molecular characterization of the DNA sequence and integrity of the transgene locus. In addition, molecular characterization is a critical component of event selection and advancement during product development. Typically, molecular characterization has relied on Southern blot analysis to establish locus and copy number along with targeted sequencing of polymerase chain reaction products spanning any inserted DNA to complete the characterization process. Here we describe the use of next generation (NexGen) sequencing and junction sequence analysis bioinformatics in a new method for achieving full molecular characterization of a GM event without the need for Southern blot analysis. In this study, we examine a typical GM soybean [Glycine max (L.) Merr.] line and demonstrate that this new method provides molecular characterization equivalent to the current Southern blot-based method. We also examine an event containing in vivo DNA rearrangement of multiple transfer DNA inserts to demonstrate that the new method is effective at identifying complex cases. Next generation sequencing and bioinformatics offer certain advantages over current approaches, most notably the simplicity, efficiency, and consistency of the method, and provide a viable alternative for efficiently and robustly achieving molecular characterization of GM crops.

  3. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    Directory of Open Access Journals (Sweden)

    Teresa Milano

    2016-01-01

    Full Text Available The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in the wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups.

  4. Operator theory a comprehensive course in analysis, part 4

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 4 focuses on operator theory, especially on a Hilbert space. Central topics are the spectral theorem, the theory of trace class and Fredholm determinants, and the study of

  5. pocketZebra: a web-server for automated selection and classification of subfamily-specific binding sites by bioinformatic analysis of diverse protein families.

    Science.gov (United States)

    Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Svedas, Vytas

    2014-07-01

    The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure-function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Massively parallel signature sequencing and bioinformatics analysis identifies up-regulation of TGFBI and SOX4 in human glioblastoma.

    Directory of Open Access Journals (Sweden)

    Biaoyang Lin

    Full Text Available BACKGROUND: A comprehensive network-based understanding of molecular pathways abnormally altered in glioblastoma multiforme (GBM) is essential for developing effective therapeutic approaches for this deadly disease. METHODOLOGY/PRINCIPAL FINDINGS: Applying a next generation sequencing technology, massively parallel signature sequencing (MPSS), we identified a total of 4535 genes that are differentially expressed between normal brain and GBM tissue. The expression changes of three up-regulated genes, CHI3L1, CHI3L2, and FOXM1, and two down-regulated genes, neurogranin and L1CAM, were confirmed by quantitative PCR. Pathway analysis revealed that TGF-beta pathway related genes were significantly up-regulated in GBM tumor samples. An integrative pathway analysis of the TGF-beta signaling network identified two alternative TGF-beta signaling pathways mediated by SOX4 (sex determining region Y-box 4) and TGFBI (transforming growth factor beta induced). Quantitative RT-PCR and immunohistochemistry staining demonstrated that SOX4 and TGFBI expression is elevated in GBM tissues compared with normal brain tissues at both the RNA and protein levels. In vitro functional studies confirmed that TGFBI and SOX4 expression is increased by TGF-beta stimulation and decreased by a specific inhibitor of TGF-beta receptor 1 kinase. CONCLUSIONS/SIGNIFICANCE: Our MPSS database for GBM and normal brain tissues provides a useful resource for the scientific community. The identification of non-SMAD mediated TGF-beta signaling pathways acting through SOX4 and TGFBI (GENE ID:7045) in GBM indicates that these alternative pathways should be considered, in addition to the canonical SMAD mediated pathway, in the development of new therapeutic strategies targeting TGF-beta signaling in GBM. Finally, the construction of an extended TGF-beta signaling network with overlaid gene expression changes between GBM and normal brain extends our understanding of the biology of GBM.

  7. Using miscue analysis to assess comprehension in deaf college readers.

    Science.gov (United States)

    Albertini, John; Mayer, Connie

    2011-01-01

    For over 30 years, teachers have used miscue analysis as a tool to assess and evaluate the reading abilities of hearing students in elementary and middle schools and to design effective literacy programs. More recently, teachers of deaf and hard-of-hearing students have also reported its usefulness for diagnosing word- and phrase-level reading difficulties and for planning instruction. To our knowledge, miscue analysis has not been used with older, college-age deaf students who might also be having difficulty decoding and understanding text at the word level. The goal of this study was to determine whether such an analysis would be helpful in identifying the source of college students' reading comprehension difficulties. After analyzing the miscues of 10 college-age readers and the results of other comprehension-related tasks, we concluded that comprehension of basic grade school-level passages depended on the ability to recognize and comprehend key words and phrases in these texts. We also concluded that these diagnostic procedures provided useful information about the reading abilities and strategies of each reader that had implications for designing more effective interventions.

  8. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  9. The Comprehension Problems for Second-Language Learners with Poor Reading Comprehension Despite Adequate Decoding: A Meta-Analysis

    Science.gov (United States)

    Spencer, Mercedes; Wagner, Richard K.

    2017-01-01

    We conducted a meta-analysis of 16 existing studies to examine the nature of the comprehension problems for children who were second-language learners with poor reading comprehension despite adequate decoding. Results indicated that these children had deficits in oral language (d = -0.80), but these deficits were not as severe as their reading…

  10. Comprehensive Analysis Competence and Innovative Approaches for Sustainable Chemical Production.

    Science.gov (United States)

    Appel, Joerg; Colombo, Corrado; Dätwyler, Urs; Chen, Yun; Kerimoglu, Nimet

    2016-01-01

    Humanity currently sees itself facing enormous economic, ecological, and social challenges. Sustainable products and production in specialty chemistry are an important strategic element to address these megatrends. In addition to that, digitalization and global connectivity will create new opportunities for the industry. One aspect is examined in this paper, which shows the development of comprehensive analysis of production networks for a more sustainable production in which the need for innovative solutions arises. Examples from data analysis, advanced process control and automated performance monitoring are shown. These efforts have significant impact on improved yields, reduced energy and water consumption, and better product performance in the application of the products.

  11. Prenatal alcohol exposure alters gene expression in the rat brain: Experimental design and bioinformatic analysis of microarray data

    Directory of Open Access Journals (Sweden)

    Alexandre A. Lussier

    2015-09-01

    Full Text Available We previously identified gene expression changes in the prefrontal cortex and hippocampus of rats prenatally exposed to alcohol under both steady-state and challenge conditions (Lussier et al., 2015, Alcohol.: Clin. Exp. Res., 39, 251–261). In this study, adult female rats from three prenatal treatment groups (ad libitum-fed control, pair-fed, and ethanol-fed) were injected with physiological saline solution or complete Freund's adjuvant (CFA) to induce arthritis (adjuvant-induced arthritis, AA). The prefrontal cortex and hippocampus were collected 16 days (peak of arthritis) or 39 days (during recovery) following injection, and whole genome gene expression was assayed using Illumina's RatRef-12 expression microarray. Here, we provide additional metadata, detailed explanations of data pre-processing steps and quality control, as well as a basic framework for the bioinformatic analyses performed. The datasets from this study are publicly available on the GEO repository (accession number GSE63561).

  12. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  13. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Abstract Background: Poikiloderma with Neutropenia (PN) is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results: We characterize six PN patients, expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations' effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations, which depicts clusters featuring a founder effect. Conclusions: In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of the C16orf57 gene seems the proper way to address the correct diagnosis of PN, enabling the syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  14. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  15. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
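
    The "huge number of possible trees" can be made concrete with the standard counting result for fully bifurcating labelled trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n OTUs. The sketch below is an illustration added here, not material from the paper, and the function names are hypothetical.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2; 1 when k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_otus: int) -> int:
    """Unrooted, fully bifurcating labelled topologies: (2n - 5)!! for n >= 3."""
    if n_otus < 3:
        return 1  # one or two OTUs admit only a single trivial topology
    return double_factorial(2 * n_otus - 5)


def count_rooted_trees(n_otus: int) -> int:
    """Rooted, fully bifurcating labelled topologies: (2n - 3)!! for n >= 2."""
    if n_otus < 2:
        return 1
    return double_factorial(2 * n_otus - 3)


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    Ten OTUs already give 2,027,025 unrooted and 34,459,425 rooted topologies, which is why exhaustive search is impractical beyond small samples and heuristic tree search is the norm.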

  16. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods that have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to real bioinformatics problems, and contains 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation.

  17. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    Full Text Available BACKGROUND: Colorectal cancer (CRC) is, with approximately 1 million cases, the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns in the hope of improving early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large amount of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. According to predictions based on MSS and MSI pathomechanisms we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the rate of most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene where so far germline mutations are associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  18. BioRuby: bioinformatics software for the Ruby programming language.

    Science.gov (United States)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-10-15

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. katayama@bioruby.org

  19. Development of comprehensive and versatile framework for reactor analysis, MARBLE

    International Nuclear Information System (INIS)

    Yokoyama, Kenji; Hazama, Taira; Numata, Kazuyuki; Jin, Tomoyuki

    2014-01-01

    Highlights: • We have developed a neutronics code system for reactor analysis. • The new code system covers all five phases of the core design procedures. • All the functionalities are integrated and validated in the same framework. • The framework supports continuous improvement and extension. • We report results of validation and practical applications. - Abstract: A comprehensive and versatile reactor analysis code system, MARBLE, has been developed. MARBLE is designed as a software development framework for reactor analysis, which offers reusable and extendible functions and data models based on physical concepts, rather than a reactor analysis code system. From a viewpoint of the code system, it provides a set of functionalities utilized in a detailed reactor analysis scheme for fast criticality assemblies and power reactors, and nuclear data related uncertainty quantification such as cross-section adjustment. MARBLE includes five sub-systems named ECRIPSE, BIBLO, SCHEME, UNCERTAINTY and ORPHEUS, which are constructed of the shared functions and data models in the framework. By using these sub-systems, MARBLE covers all phases required in fast reactor core design prediction and improvement procedures, i.e. integral experiment database management, nuclear data processing, fast criticality assembly analysis, uncertainty quantification, and power reactor analysis. In the present paper, these functionalities are summarized and system validation results are described

  20. Analysis of Virtual Learning Environments from a Comprehensive Semiotic Perspective

    Directory of Open Access Journals (Sweden)

    Gloria María Álvarez Cadavid

    2012-11-01

    Full Text Available Although there is a wide variety of perspectives and models for the study of online education, most of these focus on the analysis of the verbal aspects of such learning, while very few consider the relationship between speech and elements of a different nature, such as images and hypermediality. In a previous article we presented a proposal for a comprehensive semiotic analysis of virtual learning environments that more recently has been developed and tested for the study of different online training courses without instructional intervention. In this paper we use this same proposal to analyze online learning environments in the framework of courses with instructional intervention. One of the main observations in relation to this type of analyses is that the organizational aspects of the courses are found to be related to the way in which the input elements for the teaching and learning process are constructed.

  1. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928
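
    The idea of backward chaining from a desired result to a workflow can be sketched as follows. This is a toy illustration, not BETSY's implementation or API: the rule table, data-type names and module names are all hypothetical, and each data type is assumed to be produced by exactly one rule, whereas a real knowledge base would offer alternatives and parameters.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical knowledge base: output data type -> (module name, required inputs).
RULES: Dict[str, Tuple[str, List[str]]] = {
    "trimmed_reads": ("trim_adapters", ["raw_reads"]),
    "aligned_reads": ("align_reads", ["trimmed_reads", "reference_genome"]),
    "gene_counts": ("count_reads", ["aligned_reads", "gene_annotation"]),
}


def plan(goal: str, available: Set[str],
         visiting: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Work backwards from `goal`; return ordered module names, or None."""
    if goal in available:
        return []  # already on hand, nothing to run
    if goal in visiting or goal not in RULES:
        return None  # circular dependency, or no rule produces this type
    module, inputs = RULES[goal]
    steps: List[str] = []
    for needed in inputs:
        sub_steps = plan(needed, available, visiting | {goal})
        if sub_steps is None:
            return None  # a prerequisite cannot be satisfied
        for step in sub_steps:
            if step not in steps:  # do not schedule a shared step twice
                steps.append(step)
    steps.append(module)  # run this module after all of its prerequisites
    return steps


on_hand = {"raw_reads", "reference_genome", "gene_annotation"}
print(plan("gene_counts", on_hand))
```

    Starting from the goal rather than from the inputs means that only the steps actually needed for the requested result are scheduled, and a missing input is reported as an unsatisfiable goal instead of a failure in the middle of a run.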

  2. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    Bioinformatics is an interdisciplinary subject, which uses computer application, statistics, mathematics and engineering for the analysis and management of biological information. It has become an important tool for basic and applied research in veterinary sciences. Bioinformatics has brought about advancements into ...

  3. Temporal dynamics of soil microbial communities under different moisture regimes: high-throughput sequencing and bioinformatics analysis

    Science.gov (United States)

    Semenov, Mikhail; Zhuravleva, Anna; Semenov, Vyacheslav; Yevdokimov, Ilya; Larionova, Alla

    2017-04-01

    Recent climate scenarios predict not only continued global warming but also an increased frequency and intensity of extreme climatic events such as strong changes in temperature and precipitation regimes. Microorganisms are well known to be more sensitive to changes in environmental conditions than to other soil chemical and physical parameters. In this study, we determined the shifts in soil microbial community structure as well as indicative taxa in soils under three moisture regimes using high-throughput Illumina sequencing and a range of bioinformatics approaches for the assessment of sequence data. Incubation experiments were performed in soil-filled (Greyic Phaeozems Albic) rhizoboxes with maize and without plants. Three contrasting moisture regimes were simulated: 1) optimal wetting (OW), with watering 2-3 times per week to maintain soil moisture of 20-25% by weight; 2) periodic wetting (PW), with alternating periods of wetting and drought; and 3) constant insufficient wetting (IW), in which soil moisture of 12% by weight was permanently maintained. Sampled fresh soils were homogenized, and the total DNA of three replicates was extracted using the FastDNA® SPIN kit for Soil. DNA replicates were combined in a pooled sample and the DNA was used for PCR with specific primers for the 16S V3 and V4 regions. In order to compare variability between different samples and replicates within a single sample, some DNA replicates were treated separately. The products were purified and submitted to Illumina MiSeq sequencing. Sequence data were evaluated by alpha-diversity (Chao1 and Shannon H' diversity indexes), beta-diversity (UniFrac and Bray-Curtis dissimilarity), heatmap, tagcloud, and bar-plot analyses using the MiSeq Reporter Metagenomics Workflow and R packages (phyloseq, vegan, tagcloud). The Shannon index varied in a rather narrow range (4.4-4.9) with the lowest values for microbial communities under PW treatment. The Chao1 index varied from 385 to 480, being a more flexible
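
    For readers unfamiliar with the diversity measures named above, the following Python sketch shows how they are commonly defined. It is an illustration only, not the study's pipeline (which used R packages): the function names and the example counts are hypothetical, the Shannon index uses the natural logarithm (an assumption, since the abstract does not state the base), and Chao1 is given in its bias-corrected form.

```python
import math
from collections import Counter
from typing import Sequence


def shannon(counts: Sequence[int]) -> float:
    """Shannon H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of taxa seen exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a: Sequence[int], b: Sequence[int]) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


# Hypothetical read counts per taxon for one sample (not from the study).
sample = [10, 5, 2, 2, 1, 1, 1, 0]
print(chao1(sample))  # 8.0: 7 observed taxa plus 3*2 / (2*3)
```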

  4. A comprehensive risk analysis of coastal zones in China

    Science.gov (United States)

    Wang, Guanghui; Liu, Yijun; Wang, Hongbing; Wang, Xueying

    2014-03-01

    Although coastal zones occupy an important position in world development, they face high risks and vulnerability to natural disasters because of their special locations and their high population density. In order to estimate their capability for crisis-response, various models have been established. However, those studies mainly focused on natural factors or conditions, which could not reflect the social vulnerability and regional disparities of coastal zones. Drawing lessons from the experiences of the United Nations Environment Programme (UNEP), this paper presents a comprehensive assessment strategy based on the mechanism of Risk Matrix Approach (RMA), which includes two aspects that are further composed of five second-class indicators. The first aspect, the probability phase, consists of indicators of economic conditions, social development, and living standards, while the second one, the severity phase, is comprised of geographic exposure and natural disasters. After weighing all of the above indicators by applying the Analytic Hierarchy Process (AHP) and Delphi Method, the paper uses the comprehensive assessment strategy to analyze the risk indices of 50 coastal cities in China. The analytical results are presented in ESRI ArcGIS 10.1, which generates six different risk maps covering the aspects of economy, society, life, environment, disasters, and an overall assessment of the five areas. Furthermore, the study also investigates the spatial pattern of these risk maps, with detailed discussion and analysis of different risks in coastal cities.
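
    The AHP weighting step mentioned above can be sketched as follows. This is a generic illustration, not the paper's procedure: the pairwise judgments are hypothetical, and the row geometric-mean method is used as a common approximation to the principal-eigenvector weights (the two agree for a perfectly consistent matrix).

```python
import math
from typing import List


def ahp_weights(pairwise: List[List[float]]) -> List[float]:
    """Derive indicator weights from a reciprocal pairwise-comparison matrix.

    Uses the row geometric-mean approximation, normalised to sum to 1.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments for three indicators: entry [i][j] says how much
# more important indicator i is than indicator j.
pairwise = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])  # [0.571, 0.286, 0.143]
```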

  5. Comprehensive two-dimensional liquid chromatographic analysis of poloxamers.

    Science.gov (United States)

    Malik, Muhammad Imran; Lee, Sanghoon; Chang, Taihyun

    2016-04-15

    Poloxamers are low molar mass triblock copolymers of poly(ethylene oxide) (PEO) and poly(propylene oxide) (PPO), having a number of applications as non-ionic surfactants. Comprehensive one- and two-dimensional liquid chromatographic (LC) analysis of these materials is proposed in this study. The separation of oligomers of both types (PEO and PPO) is demonstrated for several commercial poloxamers. This is accomplished at the critical conditions for one of the blocks while in interaction mode for the other block. Reversed phase LC at the critical adsorption point (CAP) of PEO allowed for oligomeric separation of triblock copolymers with regard to the PPO block, whereas normal phase LC at the CAP of PPO renders oligomeric separation with respect to the PEO block. The oligomeric separations with regard to PEO and PPO are coupled online (comprehensive 2D-LC) to reveal two-dimensional contour plots by unconventional 2D IC×IC (interaction chromatography) coupling. The study provides chemical composition mapping of both PEO and PPO, equivalent to combined molar mass and chemical composition mapping for several commercial poloxamers. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Open source tools and toolkits for bioinformatics: significance, and where are we?

    Science.gov (United States)

    Stajich, Jason E; Lapp, Hilmar

    2006-09-01

    This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.

  7. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  8. Revisiting Francisella tularensis subsp. holarctica, Causative Agent of Tularemia in Germany With Bioinformatics: New Insights in Genome Structure, DNA Methylation and Comparative Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Anne Busch

    2018-03-01

    Full Text Available Francisella (F.) tularensis is a highly virulent, Gram-negative bacterial pathogen and the causative agent of the zoonotic disease tularemia. Here, we generated, analyzed and characterized a high quality circular genome sequence of the F. tularensis subsp. holarctica strain 12T0050 that caused fatal tularemia in a hare. Besides the genomic structure, we focused on the analysis of oriC, unique to the Francisella genus and regulating replication in and outside hosts, and on the first report on genomic DNA methylation of a Francisella strain. The high quality genome was used to establish and evaluate a diagnostic whole genome sequencing pipeline. A genotyping strategy for F. tularensis was developed using various bioinformatics tools. Additionally, whole genome sequences of F. tularensis subsp. holarctica isolates collected in the years 2008–2015 in Germany were generated. A phylogenetic analysis made it possible to determine the genetic relatedness of these isolates and confirmed the highly conserved nature of F. tularensis subsp. holarctica.

  9. Advanced complex analysis a comprehensive course in analysis, part 2b

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 2B provides a comprehensive look at a number of subjects of complex analysis not included in Part 2A. Presented in this volume are the theory of conformal metrics (includ

  10. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  11. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  12. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  13. A bioinformatics roadmap for the human vaccines project.

    Science.gov (United States)

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  14. Stochastic biological response to radiation. Comprehensive analysis of gene expression

    International Nuclear Information System (INIS)

    Inoue, Tohru; Hirabayashi, Yoko

    2012-01-01

    The authors explain that the radiation effect on biological systems is stochastic, following the laws of physics and differing from chemical effects, using examples of Cs-137 gamma-ray (GR) and benzene (BZ) exposures of mice and the resulting comprehensive analyses of gene expression. Single GR irradiation is done with Gamma Cell 40 (CSR) to C57BL/6 or C3H/He mouse at 0, 0.6 and 3 Gy. BZ is given orally at 150 mg/kg/day for 5 days x 2 weeks. Bone marrow cells are sampled 1 month after the exposure. Comprehensive gene expression is analyzed by Gene Chip Mouse Genome 430 2.0 Array (Affymetrix) and data are processed by programs like case normalization, statistics, network generation, functional analysis etc. GR irradiation brings about changes of gene expression, which are classifiable into common genes, which vary consistently with the dose change, and stochastic genes, which vary stochastically within each dose: e.g., with the Welch t-test, significant differences are between 0/3 Gy (dose-specific difference, 455 pbs (probe sets), in stochastic 2113 pbs), 0/0.6 Gy (267 in 1284 pbs) and 0.6/3 Gy (532 pbs); and with one-way analysis of variance (ANOVA) and hierarchical/dendrographic analyses, 520 pbs are shown to involve the dose-dependent 226 and dose-specific 294 pbs. It is also shown that at 3 Gy, expression of common genes is rather suppressed, including those related to the proliferation/apoptosis of B/T cells, as is that of stochastic genes related to cell division/signaling. A Venn diagram of the common genes of the above 520 pbs, the stochastic 2113 pbs at 3 Gy and the 1284 pbs at 0.6 Gy shows the overlapping genes 29, 2 and 4, respectively, indicating only 35 pbs are overlapping in total. Network analysis of changes by GR shows the rather high expression of genes around hub of cAMP response element binding protein (CREB) at 0.6 Gy, and rather variable expression around CREB hub/suppressed expression of kinesin hub at 3 Gy; in the network by BZ exposure, unchanged or low expression around p53 hub and suppression
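
    The Welch t-test used above to compare dose groups differs from Student's t-test in that it does not assume equal variances. A minimal Python sketch of the statistic and its Welch-Satterthwaite degrees of freedom follows. The function name and the expression values are hypothetical, and the p-value lookup is omitted because it needs a t-distribution routine that the standard library does not provide.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two group means (sample variance / n).
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical expression values for one probe set in two dose groups.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
irradiated = [2.0, 4.0, 6.0, 8.0, 10.0]
t, df = welch_t(control, irradiated)
print(round(t, 3), round(df, 3))  # -1.897 5.882
```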

  15. Comprehensive analysis of a medication dosing error related to CPOE.

    Science.gov (United States)

    Horsky, Jan; Kuperman, Gilad J; Patel, Vimla L

    2005-01-01

    This case study of a serious medication error demonstrates the necessity of a comprehensive methodology for the analysis of failures in interaction between humans and information systems. The authors used a novel approach to analyze a dosing error related to computer-based ordering of potassium chloride (KCl). The method included a chronological reconstruction of events and their interdependencies from provider order entry usage logs, semistructured interviews with involved clinicians, and interface usability inspection of the ordering system. Information collected from all sources was compared and evaluated to understand how the error evolved and propagated through the system. In this case, the error was the product of faults in interaction among human and system agents that methods limited in scope to their distinct analytical domains would not identify. The authors characterized errors in several converging aspects of the drug ordering process: confusing on-screen laboratory results review, system usability difficulties, user training problems, and suboptimal clinical system safeguards that all contributed to a serious dosing error. The results of the authors' analysis were used to formulate specific recommendations for interface layout and functionality modifications, suggest new user alerts, propose changes to user training, and address error-prone steps of the KCl ordering process to reduce the risk of future medication dosing errors.

  16. An updated comprehensive techno-economic analysis of algae biodiesel.

    Science.gov (United States)

    Nagarajan, Sanjay; Chou, Siaw Kiang; Cao, Shenyan; Wu, Chen; Zhou, Zhi

    2013-10-01

    Algae biodiesel is a promising but expensive alternative fuel to petro-diesel. To overcome cost barriers, detailed cost analyses are needed. A decade-old cost analysis by the U.S. National Renewable Energy Laboratory indicated that the costs of algae biodiesel were in the range of $0.53-0.85/L (2012 USD values). However, the costs of land and transesterification were only roughly estimated. In this study, an updated comprehensive techno-economic analysis was conducted with optimized processes and improved cost estimations. The latest process improvements, quotes from vendors, government databases, and other relevant data sources were used to calculate the updated algal biodiesel costs, and the final costs of biodiesel are in the range of $0.42-0.97/L. Additional improvements for cost-effective algae cultivation and biodiesel production around the globe were also recommended. Overall, the calculated costs seem promising, suggesting that a single step biodiesel production process is close to commercial reality. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Bioinformatics analysis for evaluation of the diagnostic potentialities of miR-19b, -125b and -205 as liquid biopsy markers of prostate cancer

    Science.gov (United States)

    Bryzgunova, O. E.; Lekchnov, E. A.; Zaripov, M. M.; Yurchenko, Yu. B.; Yarmoschuk, S. V.; Pashkovskaya, O. A.; Rykova, E. Yu.; Zheravin, A. A.; Laktionov, P. P.

    2017-09-01

    The presence of tumor-derived cell-free miRNA in biological fluids, as well as the simplicity and robustness of cell-free miRNA quantification, make them suitable markers for cancer diagnostics. Based on previously published data demonstrating diagnostic potentialities of miR-205 in blood and miR-19b as well as miR-125b in urine of prostate cancer patients, bioinformatics analysis was carried out to follow their involvement in prostate cancer development and select additional miRNA-markers for prostate cancer diagnostics. Studied miRNAs are involved in different signaling pathways and regulate a number of genes involved in cancer development. Five of their targets (CCND1, BRAF, CCNE1, CCNE2, RAF1), according to the STRING database, act as part of the same signaling pathway. RAF1 is regulated by miR-19b and miR-125b, and it was shown to be involved in prostate cancer development by DIANA and STRING databases. Thus, other microRNAs regulating RAF1 expression such as miR-16, -195, -497, and -7 (suggested by DIANA, TargetScan, MiRTarBase and miRDB databases) can potentially be regarded as prostate cancer markers.

  18. Bioinformatics analysis to assess potential risks of allergenicity and toxicity of HRAP and PFLP proteins in genetically modified bananas resistant to Xanthomonas wilt disease.

    Science.gov (United States)

    Jin, Yuan; Goodman, Richard E; Tetteh, Afua O; Lu, Mei; Tripathi, Leena

    2017-11-01

    Banana Xanthomonas wilt (BXW) disease threatens banana production and food security throughout East Africa. Natural resistance is lacking among common cultivars. Genetically modified (GM) bananas resistant to BXW disease were developed by inserting the hypersensitive response-assisting protein (Hrap) and/or the plant ferredoxin-like protein (Pflp) gene(s) from sweet pepper (Capsicum annuum). Several of these GM banana events showed 100% resistance to BXW disease under field conditions in Uganda. The current study evaluated the potential allergenicity and toxicity of the expressed proteins HRAP and PFLP based on evaluation of published information on the history of safe use of the natural source of the proteins as well as established bioinformatics sequence comparison methods to known allergens (www.AllergenOnline.org and NCBI Protein) and toxins (NCBI Protein). The results did not identify potential risks of allergy and toxicity to either HRAP or PFLP proteins expressed in the GM bananas that might suggest potential health risks to humans. We recognize that additional tests including stability of these proteins in pepsin assay, nutrient analysis and possibly an acute rodent toxicity assay may be required by national regulatory authorities. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  19. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  20. The utility of optical detection system (qPCR) and bioinformatics methods in reference gene expression analysis

    Science.gov (United States)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Pląder, Wojciech; Przybecki, Zbigniew

    2016-09-01

    Real-time quantitative polymerase chain reaction is considered the most reliable method for gene expression studies. However, the expression of a target gene can be misinterpreted due to improper normalization. Therefore, the crucial step in analysing qPCR data is the selection of suitable reference genes, which should be validated experimentally. In order to choose a gene with stable expression in the designed experiment, we performed reference gene expression analysis. In this study, genes described in the literature and novel genes predicted as control genes, based on the in silico analysis of transcriptome data, were used. Analysis with the geNorm and NormFinder algorithms made it possible to rank the candidate genes and indicate the best reference for the flower morphogenesis study. According to the results, the genes CACS and CYCL showed the most stable expression, whereas the least suitable genes were TUA and EF.
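
    To make the ranking step above more concrete, the sketch below computes a geNorm-style stability value M for each candidate gene: the average, over all other candidates, of the standard deviation of the log2 expression ratios across samples. Lower M means more stable expression. This is an illustration under stated assumptions, not the geNorm software itself: the function name, gene labels, and relative quantities are hypothetical, and inputs are assumed to be positive relative quantities on a linear scale.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def genorm_m(expr: Dict[str, List[float]]) -> Dict[str, float]:
    """geNorm-style stability value M per gene (lower = more stable).

    expr maps each gene to its relative quantities, one per sample,
    with samples in the same order for every gene.
    """
    genes = list(expr)
    m_values: Dict[str, float] = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Hypothetical relative quantities in three samples (not from the study).
expr = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],  # tracks geneA exactly, so their ratio is constant
    "geneC": [1.0, 4.0, 1.0],  # varies independently of the other two
}
ranking = sorted(genorm_m(expr).items(), key=lambda item: item[1])
print([gene for gene, _ in ranking])  # geneC is ranked last (least stable)
```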

  1. Systematic Analysis of the Association between Gut Flora and Obesity through High-Throughput Sequencing and Bioinformatics Approaches

    Directory of Open Access Journals (Sweden)

    Chih-Min Chiu

    2014-01-01

    Full Text Available Eighty-one stool samples from Taiwanese individuals were collected for analysis of the association between the gut flora and obesity. The supervised analysis showed that the most abundant genera of bacteria in normal samples (from people with a body mass index (BMI) ≤ 24) were Bacteroides (27.7%), Prevotella (19.4%), Escherichia (12%), Phascolarctobacterium (3.9%), and Eubacterium (3.5%). The most abundant genera of bacteria in case samples (with a BMI ≥ 27) were Bacteroides (29%), Prevotella (21%), Escherichia (7.4%), Megamonas (5.1%), and Phascolarctobacterium (3.8%). A principal coordinate analysis (PCoA) demonstrated that normal samples were clustered more compactly than case samples. An unsupervised analysis demonstrated that bacterial communities in the gut were clustered into two main groups: N-like and OB-like groups. Remarkably, most normal samples (78%) were clustered in the N-like group, and most case samples (81%) were clustered in the OB-like group (Fisher's P value = 1.61E-07). The results showed that bacterial communities in the gut were highly associated with obesity. This is the first study in Taiwan to investigate the association between human gut flora and obesity, and the results provide new insights into the correlation of bacteria with the rising trend in obesity.
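
    The association between sample type and cluster membership reported above is the kind of 2x2 comparison that Fisher's exact test addresses. The Python sketch below shows a two-sided version of the test. Because the abstract gives only percentages, the table used here is a hypothetical textbook example, not the study's counts, and the function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # A small tolerance keeps ties from being lost to rounding error.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical table: rows = normal / case samples,
# columns = N-like / OB-like cluster.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```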

  2. Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis.

    Science.gov (United States)

    Brown, William M

    2015-12-01

    Epigenetics is the study of processes--beyond DNA sequence alteration--producing heritable characteristics. For example, DNA methylation modifies gene expression without altering the nucleotide sequence. A well-studied DNA methylation-based phenomenon is genomic imprinting (ie, genotype-independent parent-of-origin effects). We aimed to elucidate: (1) the effect of exercise on DNA methylation and (2) the role of imprinted genes in skeletal muscle gene networks (ie, gene group functional profiling analyses). Gene ontology (ie, gene product elucidation)/meta-analysis. 26 skeletal muscle and 86 imprinted genes were subjected to g:Profiler ontology analysis. Meta-analysis assessed exercise-associated DNA methylation change. g:Profiler found four muscle gene networks with imprinted loci. Meta-analysis identified 16 articles (387 genes/1580 individuals) associated with exercise. Age, method, sample size, sex and tissue variation could elevate effect size bias. Only skeletal muscle gene networks including imprinted genes were reported. Exercise-associated effect sizes were calculated by gene. Age, method, sample size, sex and tissue variation were moderators. Six imprinted loci (RB1, MEG3, UBE3A, PLAGL1, SGCE, INS) were important for muscle gene networks, while meta-analysis uncovered five exercise-associated imprinted loci (KCNQ1, MEG3, GRB10, L3MBTL1, PLAGL1). DNA methylation decreased with exercise (60% of loci). Exercise-associated DNA methylation change was stronger among older people (ie, age accounted for 30% of the variation). Among older people, genes exhibiting DNA methylation decreases were part of a microRNA-regulated gene network functioning to suppress cancer. Imprinted genes were identified in skeletal muscle gene networks and exercise-associated DNA methylation change. Exercise-associated DNA methylation modification could rewind the 'epigenetic clock' as we age. CRD42014009800. Published by the BMJ Publishing Group Limited. For permission to use (where

  3. Pressure Points in Reading Comprehension: A Quantile Multiple Regression Analysis

    Science.gov (United States)

    Logan, Jessica

    2017-01-01

    The goal of this study was to examine how selected pressure points or areas of vulnerability are related to individual differences in reading comprehension and whether the importance of these pressure points varies as a function of the level of children's reading comprehension. A sample of 245 third-grade children were given an assessment battery…

  4. Correlates of Early Reading Comprehension Skills: A Componential Analysis

    Science.gov (United States)

    Babayigit, Selma; Stainthorp, Rhona

    2014-01-01

    This study had three main aims. First, we examined to what extent listening comprehension, vocabulary, grammatical skills and verbal short-term memory (VSTM) assessed prior to formal reading instruction explained individual differences in early reading comprehension levels. Second, we examined to what extent the three common component skills,…

  5. Microarray-based bioinformatics analysis of the combined effects of SiNPs and PbAc on cardiovascular system in zebrafish.

    Science.gov (United States)

    Hu, Hejing; Zhang, Yannan; Shi, Yanfeng; Feng, Lin; Duan, Junchao; Sun, Zhiwei

    2017-10-01

    With the rapid development of nanotechnology and growing environmental pollution, the combined toxic effects of SiNPs and heavy-metal pollutants such as lead have received global attention. The aim of this study was to explore the cardiovascular effects of the co-exposure of SiNPs and lead acetate (PbAc) in zebrafish using microarray and bioinformatics analysis. Although there was no obvious cardiovascular malformation other than a bleeding phenotype, bradycardia, angiogenesis inhibition and reduced cardiac output in zebrafish co-exposed to SiNPs and PbAc at the NOAEL level, significant changes were observed in mRNA and microRNA (miRNA) expression patterns. STC-GO analysis indicated that the co-exposure might have more toxic effects on the cardiovascular system than either exposure alone. Key differentially expressed genes were identified based on the Dynamic-gene-network, including stxbp1a, ndfip2, celf4 and gsk3b. Furthermore, several miRNAs obtained from the miRNA-Gene-Network might play crucial roles in cardiovascular disease, such as dre-miR-93, dre-miR-34a, dre-miR-181c, dre-miR-7145, dre-miR-730, dre-miR-129-5p, dre-miR-19d, dre-miR-218b, dre-miR-221. In addition, the analysis of the miRNA-pathway-network indicated that the zebrafish were stimulated by the co-exposure of SiNPs and PbAc, which might cause the disturbance of calcium homeostasis and endoplasmic reticulum stress. As a result, cardiac muscle contraction might be impaired. In general, our data provide abundant fundamental research clues to the combined toxicity of environmental pollutants, and further in-depth verifications are needed. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Retrospective analysis of outcomes from two intensive comprehensive aphasia programs.

    Science.gov (United States)

    Persad, Carol; Wozniak, Linda; Kostopoulos, Ellina

    2013-01-01

    Positive outcomes from intensive therapy for individuals with aphasia have been reported in the literature. Little is known about the characteristics of individuals who attend intensive comprehensive aphasia programs (ICAPs) and what factors may predict who makes clinically significant changes when attending such programs. Demographic data on participants from 6 ICAPs showed that individuals who attend these programs spanned the entire age range (from adolescence to late adulthood), but they generally tended to be middle-aged and predominantly male. Analysis of outcome data from 2 of these ICAPs found that age and gender were not significant predictors of improved outcome on measures of language ability or functional communication. However, time post onset was related to clinical improvement in functional communication as measured by the Communication Activities of Daily Living, second edition (CADL-2). In addition, for one sample, initial severity of aphasia was related to outcome on the Western Aphasia Battery-Revised, such that individuals with more severe aphasia tended to show greater recovery compared to those with mild aphasia. Initial severity of aphasia also was highly correlated with changes in CADL-2 scores. These results suggest that adults of all ages with aphasia in either the acute or chronic phase of recovery can continue to show positive improvements in language ability and functional communication with intensive treatment.

  7. Neural activity associated with metaphor comprehension: spatial analysis.

    Science.gov (United States)

    Sotillo, María; Carretié, Luis; Hinojosa, José A; Tapia, Manuel; Mercado, Francisco; López-Martín, Sara; Albert, Jacobo

    2005-01-03

    Though neuropsychological data indicate that the right hemisphere (RH) plays a major role in metaphor processing, other studies suggest that, at least during some phases of this processing, a RH advantage may not exist. The present study explores, through a temporally agile neural signal, the event-related potentials (ERPs), and through source-localization algorithms applied to ERP recordings, whether or not the crucial phase of metaphor comprehension shows a RH advantage. Participants (n=24) completed an S1-S2 experimental paradigm. S1 consisted of visually presented metaphoric sentences (e.g., "Green lung of the city"), followed by S2, which consisted of words that could (e.g., "Park") or could not (e.g., "Semaphore") be defined by S1. ERPs elicited by S2 were analyzed using temporal principal component analysis (tPCA) and source-localization algorithms. These analyses revealed that metaphorically related S2 words showed significantly higher N400 amplitudes than non-related S2 words. Source-localization algorithms showed differential activity between the two S2 conditions in the right middle/superior temporal areas. These results support the existence of an important RH contribution to (at least) one phase of metaphor processing and, furthermore, implicate the temporal cortex with respect to that contribution.

  8. A comprehensive overview on the foundations of formal concept analysis

    Directory of Open Access Journals (Sweden)

    K. Sumangali

    2017-12-01

    The immersion of voluminous collections of data is inevitable almost everywhere. The invention of mathematical models to analyse the patterns and trends of the data is an emerging necessity to extract and predict useful information in any Knowledge Discovery from Data (KDD) process. Formal Concept Analysis (FCA) is an efficient mathematical model used in the process of KDD which is specially designed to portray the structure of the data in a context and depict the underlying patterns and hierarchies in it. Due to the huge increase in the application of FCA in various fields, the number of research and review articles on FCA has risen to a large extent. This review differs from the existing ones in presenting a comprehensive survey of the fundamentals of FCA in a compact and crisp manner to benefit beginners, and it focuses on the scalability issues in FCA. Further, we present the generic anatomy of FCA apart from its origin and growth at a primary level.
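
    As a rough illustration of how FCA portrays "the structure of the data in a context", the sketch below enumerates the formal concepts (extent, intent pairs) of a tiny context using the standard derivation operators. The toy context and all function names are hypothetical and do not come from the review; the brute-force enumeration over object subsets is exponential and is only meant to show why scalability becomes an issue.

```python
from itertools import combinations

# Hypothetical toy context: each object maps to the set of attributes it has.
CONTEXT = {
    "frog": {"aquatic", "terrestrial"},
    "fish": {"aquatic"},
    "dog": {"terrestrial", "mammal"},
}


def common_attributes(objects, context):
    """Derivation A': attributes shared by every object in `objects`."""
    shared = set().union(*context.values())  # start from all attributes
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Derivation B': objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    names = list(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

    Ordering the resulting concepts by inclusion of their extents gives the concept lattice; practical FCA algorithms avoid visiting all 2^n object subsets.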

  9. Comprehensive proteomic analysis of the wheat pathogenic fungus Zymoseptoria tritici.

    Science.gov (United States)

    Yang, Fen; Yin, Qi

    2016-01-01

    Zymoseptoria tritici causes Septoria tritici blotch disease of wheat. To obtain a comprehensive protein dataset of this fungal pathogen, proteomes of Z. tritici growing in nutrient-limiting and rich media and in vivo at a late stage of wheat infection were fractionated by 1D gel or strong cation exchange (SCX) chromatography and analyzed by LC-MS/MS. A total of 5731, 5376 and 3168 Z. tritici proteins were confidently identified from these conditions, respectively. Of these in vitro and in planta proteins, 9 and 11% were predicted to contain signal peptides, respectively. Functional classification analysis revealed that the proteins were involved in various cellular activities. Comparison of the three distinct protein expression profiles demonstrates elevated carbohydrate, lipid and secondary metabolism, transport, protein processing and energy production specifically in the host environment, in contrast to the enhancement of signaling, defense, replication, transcription and cell division in vitro. The data provide useful targets towards a better understanding of the molecular basis of Z. tritici growth, development, stress response and pathogenicity. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  10. Comprehensive Study on Lexicon-based Ensemble Classification Sentiment Analysis

    Directory of Open Access Journals (Sweden)

    Łukasz Augustyniak

    2015-12-01

    We propose a novel method for counting sentiment orientation that outperforms supervised learning approaches in time and memory complexity and is not statistically significantly different from them in accuracy. Our method consists of a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, is based on calculating the frequency of features (words) in the document and averaging their impact on the sentiment score as opposed to documents that do not contain these features. Afterwards, we use ensemble classification to improve the overall accuracy of the method. What is important is that the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles with lexicon’s predictions as input and supervised learners) applied to 10 Amazon review data sets and provide the first statistical comparison of the sentiment annotation methods that include ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

  11. CGHPRO – A comprehensive data analysis tool for array CGH

    Directory of Open Access Journals (Sweden)

    Lenzner Steffen

    2005-04-01

    identification of odd clones. Conclusion CGHPRO is a comprehensive and easy-to-use data analysis tool for array CGH. Since all of its features are available offline, CGHPRO may be especially suitable in situations where protection of sensitive patient data is an issue. It is distributed under GNU GPL licence and runs on Linux and Windows.

  12. RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.

    Science.gov (United States)

    Field, Helen I; Fenyö, David; Beavis, Ronald C

    2002-01-01

    RADARS, a rapid, automated, data archiving and retrieval software system for high-throughput proteomic mass spectral data processing and storage, is described. The majority of mass spectrometer data files are compatible with RADARS, for consistent processing. The system automatically takes unprocessed data files, identifies proteins via in silico database searching, then stores the processed data and search results in a relational database suitable for customized reporting. The system is robust, used in 24/7 operation, accessible to multiple users of an intranet through a web browser, may be monitored by Virtual Private Network, and is secure. RADARS is scalable for use on one or many computers, and is suited to multiple processor systems. It can incorporate any local database in FASTA format, and can search protein and DNA databases online. A key feature is a suite of visualisation tools (many available gratis), allowing facile manipulation of spectra, by hand annotation, reanalysis, and access to all procedures. We also describe the use of Sonar MS/MS, a novel, rapid search engine requiring 40 MB RAM per process for searches against a genomic or EST database translated in all six reading frames. RADARS reduces the cost of analysis by its efficient algorithms: Sonar MS/MS can identify proteins without accurate knowledge of the parent ion mass and without protein tags. Statistical scoring methods provide close-to-expert accuracy and bring robust data analysis to the non-expert user.

  13. MicroRNAs and cardiac sarcoplasmic reticulum calcium ATPase-2 in human myocardial infarction: expression and bioinformatic analysis.

    Science.gov (United States)

    Boštjančič, Emanuela; Zidar, Nina; Glavač, Damjan

    2012-10-15

    Cardiac sarco(endo)plasmic reticulum calcium ATPase-2 (SERCA2) plays one of the central roles in myocardial contractility. Both SERCA2 mRNA and protein are reduced in myocardial infarction (MI), but the correlation has not always been observed. MicroRNAs (miRNAs) act by targeting 3'-UTR mRNA, causing translational repression in physiological and pathological conditions, including cardiovascular diseases. One of the aims of our study was to identify miRNAs that could influence SERCA2 expression in human MI. The protein SERCA2 was decreased and 43 miRNAs were deregulated in infarcted myocardium compared to corresponding remote myocardium, analyzed by western blot and microRNA microarrays, respectively. All the samples were stored as FFPE tissue and in RNAlater. Prediction of miRNA binding to SERCA2 using four prediction algorithms (TargetScan, PicTar, miRanda and mirTarget2) identified 213 putative miRNAs. TAM and miRNApath annotation of deregulated miRNAs identified 18 functional and 21 diseased states related to heart diseases, and association of half of the deregulated miRNAs with SERCA2. Free energy of binding and flanking regions (RNA22, RNAfold) was calculated for 10 up-regulated miRNAs from microarray analysis (miR-122, miR-320a/b/c/d, miR-574-3p/-5p, miR-199a, miR-140, and miR-483), and nine miRNAs deregulated in microarray analysis were used for validation with qPCR (miR-21, miR-122, miR-126, miR-1, miR-133, miR-125a/b, and miR-98). Based on qPCR results, comparisons between FFPE and RNAlater stored tissue samples, between Sybr Green and TaqMan approaches, as well as between different reference genes were also performed. Combining all the results, we identified certain miRNAs as potential regulators of SERCA2; however, further functional studies are needed for verification. Using qPCR, we confirmed deregulation of nine miRNAs in human MI, and show that the qPCR normalization strategy is important for the outcome of miRNA expression analysis in human MI.

  14. A Comprehensive Analysis of Alternative Splicing in Paleopolyploid Maize

    Directory of Open Access Journals (Sweden)

    Wenbin Mei

    2017-05-01

    Identifying and characterizing alternative splicing (AS) enables our understanding of the biological role of transcript isoform diversity. This study describes the use of publicly available RNA-Seq data to identify and characterize the global diversity of AS isoforms in maize using the inbred lines B73 and Mo17, and a related species, sorghum. Identification and characterization of AS within maize tissues revealed that genes expressed in seed exhibit the largest differential AS relative to other tissues examined. Additionally, differences in AS between the two genotypes B73 and Mo17 are greatest within genes expressed in seed. We demonstrate that changes in the level of alternatively spliced transcripts (intron retention and exon skipping) do not solely reflect differences in total transcript abundance, and we present evidence that intron retention may act to fine-tune gene expression across seed development stages. Furthermore, we have identified temperature sensitive AS in maize and demonstrate that drought-induced changes in AS involve distinct sets of genes in reproductive and vegetative tissues. Examining our identified AS isoforms within B73 × Mo17 recombinant inbred lines (RILs) identified splicing QTL (sQTL). The 43.3% of cis-sQTL regulated junctions are actually identified as alternatively spliced junctions in our analysis, while 10 Mb windows on each side of 48.2% of trans-sQTLs overlap with splicing related genes. Using sorghum as an out-group enabled direct examination of loss or conservation of AS between homeologous genes representing the two subgenomes of maize. We identify several instances where AS isoforms that are conserved between one maize homeolog and its sorghum ortholog are absent from the second maize homeolog, suggesting that these AS isoforms may have been lost after the maize whole genome duplication event. This comprehensive analysis provides new insights into the complexity of AS in maize.

  15. The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines.

    Science.gov (United States)

    Kristensen, Tatjana P; Maria Cherian, Reeja; Gray, Fiona C; MacNeill, Stuart A

    2014-01-01

    The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural studies.

  16. Evaluation of simultaneous nutrient and COD removal with polyhydroxybutyrate (PHB) accumulation using mixed microbial consortia under anoxic condition and their bioinformatics analysis.

    Science.gov (United States)

    Jena, Jyotsnarani; Kumar, Ravindra; Dixit, Anshuman; Pandey, Sony; Das, Trupti

    2015-01-01

    Simultaneous nitrate-N, phosphate and COD removal was evaluated from synthetic waste water using mixed microbial consortia in an anoxic environment under various initial carbon loads (ICL) in a batch scale reactor system. Within 6 hours of incubation, enriched DNPAOs (Denitrifying Polyphosphate Accumulating Microorganisms) were able to remove maximum COD (87%) at 2 g/L of ICL whereas maximum nitrate-N (97%) and phosphate (87%) removal along with PHB accumulation (49 mg/L) was achieved at 8 g/L of ICL. Exhaustion of nitrate-N, beyond 6 hours of incubation, had a detrimental effect on COD and phosphate removal rate. Fresh supply of nitrate-N to the reaction medium, beyond 6 hours, helped revive the removal rates of both COD and phosphate. Therefore, it was apparent that in spite of a high carbon load, maximum COD and nutrient removal can be maintained, with adequate nitrate-N availability. Denitrifying condition in the medium was evident from an increasing pH trend. PHB accumulation by the mixed culture was directly proportional to ICL; however, accumulation took longer at higher ICL. Unlike conventional EBPR, PHB depletion did not support phosphate accumulation in this case. The unique aspect of all the batch studies was that PHB accumulation was observed along with phosphate uptake and nitrate reduction under anoxic conditions. Bioinformatics analysis followed by pyrosequencing of the mixed culture DNA from the seed sludge revealed the dominance of denitrifying populations, such as Corynebacterium, Rhodocyclus and Paracoccus (Alphaproteobacteria and Betaproteobacteria). Rarefaction curve indicated complete bacterial population and corresponding number of OTUs through sequence analysis. Chao1 and Shannon index (H') were used to study the diversity of sampling. "UCI95" and "LCI95" indicated 95% confidence level of upper and lower values of Chao1 for each distance. Values of Chao1 index supported the results of rarefaction curve.
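
    For readers unfamiliar with the two diversity measures named above, the following sketch computes the Shannon index H' and a Chao1 richness estimate from a list of OTU read counts. The abstract does not say which Chao1 variant or software was used; the bias-corrected form S_obs + F1(F1 - 1) / (2(F2 + 1)) is assumed here, and the example counts are invented for illustration only.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant).

    F1 is the number of singleton OTUs and F2 the number of doubleton OTUs.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    otu_counts = [10, 5, 2, 1, 1, 1]  # hypothetical read counts per OTU
    print("Shannon H':", shannon_index(otu_counts))
    print("Chao1:", chao1(otu_counts))
```

    When Chao1 stays close to the observed OTU count, few rare OTUs remain unsampled, which is consistent with a rarefaction curve that has levelled off.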

  17. Evaluation of simultaneous nutrient and COD removal with polyhydroxybutyrate (PHB) accumulation using mixed microbial consortia under anoxic condition and their bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Jyotsnarani Jena

    Simultaneous nitrate-N, phosphate and COD removal was evaluated from synthetic waste water using mixed microbial consortia in an anoxic environment under various initial carbon load (ICL) in a batch scale reactor system. Within 6 hours of incubation, enriched DNPAOs (Denitrifying Polyphosphate Accumulating Microorganisms) were able to remove maximum COD (87%) at 2 g/L of ICL whereas maximum nitrate-N (97%) and phosphate (87%) removal along with PHB accumulation (49 mg/L) was achieved at 8 g/L of ICL. Exhaustion of nitrate-N, beyond 6 hours of incubation, had a detrimental effect on COD and phosphate removal rate. Fresh supply of nitrate-N to the reaction medium, beyond 6 hours, helped revive the removal rates of both COD and phosphate. Therefore, it was apparent that in spite of a high carbon load, maximum COD and nutrient removal can be maintained, with adequate nitrate-N availability. Denitrifying condition in the medium was evident from an increasing pH trend. PHB accumulation by the mixed culture was directly proportional to ICL; however the time taken for accumulation at higher ICL was more. Unlike conventional EBPR, PHB depletion did not support phosphate accumulation in this case. The unique aspect of all the batch studies was that PHB accumulation was observed along with phosphate uptake and nitrate reduction under anoxic conditions. Bioinformatics analysis followed by pyrosequencing of the mixed culture DNA from the seed sludge revealed the dominance of denitrifying population, such as Corynebacterium, Rhodocyclus and Paracoccus (Alphaproteobacteria and Betaproteobacteria). Rarefaction curve indicated complete bacterial population and corresponding number of OTUs through sequence analysis. Chao1 and Shannon index (H') were used to study the diversity of sampling. "UCI95" and "LCI95" indicated 95% confidence level of upper and lower values of Chao1 for each distance. Values of Chao1 index supported the results of rarefaction curve.

  18. The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines

    Directory of Open Access Journals (Sweden)

    Tatjana P Kristensen

    2014-03-01

    The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural

  19. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  20. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  1. Evaluation of Bioinformatics Approaches for Next-Generation Sequencing Analysis of microRNAs with a Toxicogenomics Study Design

    Directory of Open Access Journals (Sweden)

    Halil Bisgin

    2018-02-01

    MicroRNAs (miRNAs) are key post-transcriptional regulators that affect protein translation by targeting mRNAs. Their role in disease etiology and toxicity is well recognized. Given the rapid advancement of next-generation sequencing techniques, miRNA profiling has been increasingly conducted with RNA-seq, namely miRNA-seq. Analysis of miRNA-seq data requires several steps: (1) mapping the reads to miRBase, (2) considering mismatches during the hairpin alignment (windowing), and (3) counting the reads (quantification). The choice made in each step with respect to the parameter settings could affect miRNA quantification, differentially expressed miRNAs (DEMs) detection and novel miRNA identification. Furthermore, these parameters do not act in isolation and their joint effects impact miRNA-seq results and interpretation. In toxicogenomics, the variation associated with parameter setting should not overpower the treatment effect (such as the dose/time-dependent effect). In this study, four commonly used miRNA-seq analysis tools (i.e., miRDeep2, miRExpress, miRNAkey, sRNAbench) were comparatively evaluated with a standard toxicogenomics study design. We tested 30 different parameter settings on miRNA-seq data generated from thioacetamide-treated rat liver samples for three dose levels across four time points, followed by four normalization options. Because both miRExpress and miRNAkey yielded larger variation than that of the treatment effects across multiple parameter settings, our analyses mainly focused on the side-by-side comparison between miRDeep2 and sRNAbench. While the number of miRNAs detected by miRDeep2 was almost the subset of those detected by sRNAbench, the number of DEMs identified by both tools was comparable under the same parameter settings and normalization method. Change in the number of nucleotides out of the mature sequence in the hairpin alignment (window option) yielded the largest variation for miRNA quantification and DEMs

  2. Bioinformatic analysis of the distribution of inorganic carbon transporters and prospective targets for bioengineering to increase Ci uptake by cyanobacteria.

    Science.gov (United States)

    Gaudana, Sandeep B; Zarzycki, Jan; Moparthi, Vamsi K; Kerfeld, Cheryl A

    2015-10-01

    Cyanobacteria have evolved a carbon-concentrating mechanism (CCM) which has enabled them to inhabit diverse environments encompassing a range of inorganic carbon (Ci: HCO3− and CO2) concentrations. Several uptake systems facilitate inorganic carbon accumulation in the cell, which can in turn be fixed by ribulose 1,5-bisphosphate carboxylase/oxygenase. Here we survey the distribution of genes encoding known Ci uptake systems in cyanobacterial genomes and, using a pfam- and gene context-based approach, identify in the marine (alpha) cyanobacteria a heretofore unrecognized number of putative counterparts to the well-known Ci transporters of beta cyanobacteria. In addition, our analysis shows that there is a huge repertoire of transport systems in cyanobacteria of unknown function, many with homology to characterized Ci transporters. These can be viewed as prospective targets for conversion into ancillary Ci transporters through bioengineering. Increasing intracellular Ci concentration coupled with efforts to increase carbon fixation will be beneficial for the downstream conversion of fixed carbon into value-added products including biofuels. In addition to CCM transporter homologs, we also survey the occurrence of rhodopsin homologs in cyanobacteria, including bacteriorhodopsin, a class of retinal-binding, light-activated proton pumps. Because they are light driven and because of the apparent ease of altering their ion selectivity, we use this as an example of re-purposing an endogenous transporter for the augmentation of Ci uptake by cyanobacteria and potentially chloroplasts.

  3. Bioinformatic analysis of the nucleotide binding site-encoding disease-resistance genes in foxtail millet (Setaria italica (L.) Beauv.).

    Science.gov (United States)

    Zhu, Y B; Xie, X Q; Li, Z Y; Bai, H; Dong, L; Dong, Z P; Dong, J G

    2014-08-28

    The nucleotide-binding site (NBS) disease-resistance genes are the largest category of plant disease-resistance gene analogs. The complete set of disease-resistant candidate genes, which encode the NBS sequence, was filtered in the genomes of two varieties of foxtail millet (Yugu1 and 'Zhang gu'). This study investigated a number of characteristics of the putative NBS genes, such as structural diversity and phylogenetic relationships. A total of 269 and 281 NBS-coding sequences were identified in Yugu1 and 'Zhang gu', respectively. When the two databases were compared, 72 genes were found to be identical and 164 genes showed more than 90% similarity. Physical positioning and gene family analysis of the NBS disease-resistance genes in the genome revealed that the number of genes on each chromosome was similar in both varieties. The eighth chromosome contained the largest number of genes and the ninth chromosome contained the lowest number of genes. Exactly 34 gene clusters containing the 161 genes were found in the Yugu1 genome, with each cluster containing 4.7 genes on average. In comparison, the 'Zhang gu' genome possessed 28 gene clusters, which had 151 genes, with an average of 5.4 genes in each cluster. The largest gene cluster, located on the eighth chromosome, contained 12 genes in the Yugu1 database, whereas it contained 16 genes in the 'Zhang gu' database. The classification results showed that the CC-NBS-LRR gene made up the largest part of each chromosome in the two databases. Two TIR-NBS genes were also found in the Yugu1 genome.

  4. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange between bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  5. Comprehensive Behavioral Analysis of Activating Transcription Factor 5-Deficient Mice

    Directory of Open Access Journals (Sweden)

    Mariko Umemura

    2017-07-01

    Activating transcription factor 5 (ATF5) is a member of the CREB/ATF family of basic leucine zipper transcription factors. We previously reported that ATF5-deficient (ATF5-/-) mice demonstrated abnormal olfactory bulb development due to impaired interneuron supply. Furthermore, ATF5-/- mice were less aggressive than ATF5+/+ mice. Although ATF5 is widely expressed in the brain, and involved in the regulation of proliferation and development of neurons, the physiological role of ATF5 in the higher brain remains unknown. Our objective was to investigate the physiological role of ATF5 in the higher brain. We performed a comprehensive behavioral analysis using ATF5-/- mice and wild type littermates. ATF5-/- mice exhibited abnormal locomotor activity in the open field test. They also exhibited abnormal anxiety-like behavior in the light/dark transition test and open field test. Furthermore, ATF5-/- mice displayed reduced social interaction in the Crawley’s social interaction test and increased pain sensitivity in the hot plate test compared with wild type. Finally, behavioral flexibility was reduced in the T-maze test in ATF5-/- mice compared with wild type. In addition, we demonstrated that ATF5-/- mice display disturbances of monoamine neurotransmitter levels in several brain regions. These results indicate that ATF5 deficiency elicits abnormal behaviors and the disturbance of monoamine neurotransmitter levels in the brain. The behavioral abnormalities of ATF5-/- mice may be due to the disturbance of monoamine levels. Taken together, these findings suggest that ATF5-/- mice may be a unique animal model of some psychiatric disorders.

  6. Computerized comprehensive data analysis of Lung Imaging Database Consortium (LIDC)

    International Nuclear Information System (INIS)

    Tan Jun; Pu Jiantao; Zheng Bin; Wang Xingwei; Leader, Joseph K.

    2010-01-01

    Purpose: The Lung Image Database Consortium (LIDC) is the largest public CT image database of lung nodules. In this study, the authors present a comprehensive and up-to-date analysis of this dynamically growing database with the help of a computerized tool, to help researchers make optimal use of the database for lung cancer related investigations. Methods: The authors developed a computer scheme to automatically match the nodule outlines marked manually by radiologists on CT images. A large variety of characteristics of the annotated nodules in the database, including volume, spiculation level, elongation, and interobserver variability, as well as the intersection of delineated nodule voxels and the overlapping ratio between the same nodules marked by different radiologists, are automatically calculated and summarized. The scheme was applied to analyze all 157 examinations with complete annotation data currently available in the LIDC dataset. Results: The scheme summarizes the statistical distributions of the abovementioned geometric and diagnosis features. Among the 391 nodules, (1) 365 (93.35%) have principal axis length ≤20 mm; (2) 120, 75, 76, and 120 were marked by one, two, three, and four radiologists, respectively; and (3) 122 (32.48%) have the maximum volume overlapping ratios ≥80% for the delineations of two radiologists, while 198 (50.64%) have the maximum volume overlapping ratios <60%. The results also showed that 72.89% of the nodules were assessed with malignancy score between 2 and 4, and only 7.93% of these nodules were considered as severely malignant (malignancy ≥4). Conclusions: This study demonstrates that LIDC contains examinations covering a diverse distribution of nodule characteristics and that it can be a useful resource for assessing the performance of nodule detection and/or segmentation schemes.
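
    The overlap measurement mentioned in the abstract can be sketched as follows. The abstract does not give the formula, so this sketch assumes the overlapping ratio is the intersection over the union of the voxel sets delineated by two radiologists; the function names and the voxel coordinates are hypothetical.

```python
from itertools import combinations


def overlap_ratio(voxels_a, voxels_b):
    """Intersection-over-union of two voxel sets (assumed definition of the ratio)."""
    a, b = set(voxels_a), set(voxels_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def max_pairwise_overlap(delineations):
    """Largest overlap ratio over all pairs of delineations of one nodule."""
    pairs = list(combinations(delineations, 2))
    if not pairs:
        return 0.0  # a nodule marked by a single reader has no pair to compare
    return max(overlap_ratio(a, b) for a, b in pairs)


if __name__ == "__main__":
    # Hypothetical (x, y, z) voxel coordinates from two readers.
    reader_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
    reader_2 = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
    print(overlap_ratio(reader_1, reader_2))
```

    A per-nodule maximum of this kind is what thresholds such as "≥80%" or "<60%" would be applied to, under the stated assumption about the ratio.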

  7. Comprehensive benefit analysis of regional water resources based on multi-objective evaluation

    Science.gov (United States)

    Chi, Yixia; Xue, Lianqing; Zhang, Hui

    2018-01-01

    The purpose of comprehensive benefit analysis of water resources is to maximize the combined social, economic, and ecological benefits. To address the shortcomings of the traditional analytic hierarchy process (AHP) in water resources evaluation, the study proposes a comprehensive benefit evaluation index covering social, economic, and environmental benefits, viewed from the perspective of the social, economic, and environmental systems. Index weights were determined by an improved fuzzy AHP, the relative index of comprehensive water resources benefit was calculated, and the comprehensive benefit of water resources in Xiangshui County was analyzed with a multi-objective evaluation model. Based on water resources data for Xiangshui County, 20 main comprehensive benefit assessment factors were evaluated for 5 districts of the county. The results showed that the comprehensive benefit of Xiangshui County was 0.7317, and that the social economy has room for further development under the current water resources situation.
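
    The weighting step can be illustrated with a minimal sketch of classical AHP, which derives weights from a pairwise comparison matrix by the geometric-mean method and then forms a weighted composite score. This is not the improved fuzzy AHP used in the study, whose details the abstract does not give; the comparison matrix and the scores below are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Classical AHP weights: normalized geometric means of the matrix rows."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def composite_benefit(weights, scores):
    """Weighted sum of normalized criterion scores (each score in [0, 1])."""
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    # Hypothetical judgments for (social, economic, environmental):
    # entry [i][j] states how much more important criterion i is than criterion j.
    pairwise = [
        [1.0, 2.0, 4.0],
        [0.5, 1.0, 2.0],
        [0.25, 0.5, 1.0],
    ]
    weights = ahp_weights(pairwise)
    print(weights)
    print(composite_benefit(weights, [0.8, 0.6, 0.7]))  # hypothetical scores
```

    The composite score is only meaningful when the criterion scores have been normalized to a common scale first, which is why the sketch assumes values in [0, 1].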

  8. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  9. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    International Nuclear Information System (INIS)

    Mefford, Megan E.; Kunstman, Kevin; Wolinsky, Steven M.; Gabuzda, Dana

    2015-01-01

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions

  10. LegumeDB1 bioinformatics resource: comparative genomic analysis and novel cross-genera marker identification in lupin and pasture legume species.

    Science.gov (United States)

    Moolhuijzen, P; Cakir, M; Hunter, A; Schibeci, D; Macgregor, A; Smith, C; Francki, M; Jones, M G K; Appels, R; Bellgard, M

    2006-06-01

    The identification of markers in legume pasture crops, which can be associated with traits such as protein and lipid production, disease resistance, and reduced pod shattering, is generally accepted as an important strategy for improving the agronomic performance of these crops. It has been demonstrated that many quantitative trait loci (QTLs) identified in one species can be found in other plant species. Detailed legume comparative genomic analyses can characterize the genome organization between model legume species (e.g., Medicago truncatula, Lotus japonicus) and economically important crops such as soybean (Glycine max), pea (Pisum sativum), chickpea (Cicer arietinum), and lupin (Lupinus angustifolius), thereby identifying candidate gene markers that can be used to track QTLs in lupin and pasture legume breeding. LegumeDB is a Web-based bioinformatics resource for legume researchers. LegumeDB analysis of Medicago truncatula expressed sequence tags (ESTs) has identified novel simple sequence repeat (SSR) markers (16 tested), some of which have been putatively linked to symbiosome membrane proteins in root nodules and cell-wall proteins important in plant-pathogen defence mechanisms. These novel markers by preliminary PCR assays have been detected in Medicago truncatula and detected in at least one other legume species, Lotus japonicus, Glycine max, Cicer arietinum, and (or) Lupinus angustifolius (15/16 tested). Ongoing research has validated some of these markers to map them in a range of legume species that can then be used to compile composite genetic and physical maps. In this paper, we outline the features and capabilities of LegumeDB as an interactive application that provides legume genetic and physical comparative maps, and the efficient feature identification and annotation of the vast tracks of model legume sequences for convenient data integration and visualization. LegumeDB has been used to identify potential novel cross-genera polymorphic legume

  11. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    Energy Technology Data Exchange (ETDEWEB)

    Mefford, Megan E., E-mail: megan_mefford@hms.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Kunstman, Kevin, E-mail: kunstman@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Wolinsky, Steven M., E-mail: s-wolinsky@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Gabuzda, Dana, E-mail: dana_gabuzda@dfci.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Department of Neurology (Microbiology and Immunobiology), Harvard Medical School, Boston, MA (United States)

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  12. RNA Sequencing and Bioinformatics Analysis Implicate the Regulatory Role of a Long Noncoding RNA-mRNA Network in Hepatic Stellate Cell Activation.

    Science.gov (United States)

    Guo, Can-Jie; Xiao, Xiao; Sheng, Li; Chen, Lili; Zhong, Wei; Li, Hai; Hua, Jing; Ma, Xiong

    2017-01-01

    To analyze the long noncoding RNA (lncRNA)-mRNA expression network and potential roles in rat hepatic stellate cells (HSCs) during activation. LncRNA expression was analyzed in quiescent and culture-activated HSCs by RNA sequencing, and differentially expressed lncRNAs verified by quantitative reverse transcription polymerase chain reaction (qRT-PCR) were subjected to bioinformatics analysis. In vivo analyses of differential lncRNA-mRNA expression were performed on a rat model of liver fibrosis. We identified upregulation of 12 lncRNAs and 155 mRNAs and downregulation of 12 lncRNAs and 374 mRNAs in activated HSCs. Additionally, we identified the differential expression of upregulated lncRNAs (NONRATT012636.2, NONRATT016788.2, and NONRATT021402.2) and downregulated lncRNAs (NONRATT007863.2, NONRATT019720.2, and NONRATT024061.2) in activated HSCs relative to levels observed in quiescent HSCs, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses showed that changes in lncRNAs associated with HSC activation revealed 11 significantly enriched pathways according to their predicted targets. Moreover, based on the predicted co-expression network, the relative dynamic levels of NONRATT013819.2 and lysyl oxidase (Lox) were compared during HSC activation both in vitro and in vivo. Our results confirmed the upregulation of lncRNA NONRATT013819.2 and Lox mRNA associated with the extracellular matrix (ECM)-related signaling pathway in HSCs and fibrotic livers. Our results detailing a dysregulated lncRNA-mRNA network might provide new treatment strategies for hepatic fibrosis based on findings indicating potentially critical roles for NONRATT013819.2 and Lox in ECM remodeling during HSC activation. © 2017 The Author(s). Published by S. Karger AG, Basel.

  13. Residue analysis of a CTL epitope of SARS-CoV spike protein by IFN-gamma production and bioinformatics prediction

    Directory of Open Access Journals (Sweden)

    Huang Jun

    2012-09-01

    Full Text Available Abstract Background Severe acute respiratory syndrome (SARS) is an emerging infectious disease caused by the novel coronavirus SARS-CoV. The T cell epitopes of the SARS CoV spike protein are well known, but no systematic evaluation of the functional and structural roles of each residue has been reported for these antigenic epitopes. Analysis of the functional importance of side-chains by mutational study may exaggerate the effect by imposing a structural disturbance or an unusual steric, electrostatic or hydrophobic interaction. Results We demonstrated that N50 could induce significant IFN-gamma response from SARS-CoV S DNA immunized mice splenocytes by means of ELISA, ELISPOT and FACS. Moreover, S366-374 was predicted to be an optimal epitope by bioinformatics tools: ANN, SMM, ARB and BIMAS, and confirmed by IFN-gamma response induced by a series of S358-374-derived peptides. Furthermore, each residue of S366-374 was replaced by alanine (A), lysine (K) or aspartic acid (D), respectively. ANN was used to estimate the binding affinity of single S366-374 mutants to H-2 Kd. Y367 and L374 were predicted to possess the most important role in peptide binding. Additionally, these one-residue mutated peptides were synthesized, and IFN-gamma production induced by G368, V369, A371, T372 and K373 mutated S366-374 was obviously decreased. Conclusions We demonstrated that S366-374 is an optimal H-2 Kd CTL epitope in the SARS CoV S protein. Moreover, Y367, S370, and L374 are anchors in the epitope, while C366, G368, V369, A371, T372, and K373 may directly interact with TCR on the surface of CD8-T cells.
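
    The single-residue substitution scan described above can be sketched as a short enumeration. The peptide string CYGVSATKL is assembled here from the residue labels given in the abstract (C366 through L374); the function name and the decision to skip substitutions that leave the residue unchanged are assumptions of this sketch, not details reported by the authors.

```python
def single_residue_mutants(peptide, start, substitutes="AKD"):
    """List (label, sequence) for every single-residue substitution.

    Substitutions that would not change the residue (for example A -> A)
    are skipped, because they reproduce the wild-type peptide.
    """
    mutants = []
    for offset, original in enumerate(peptide):
        for new in substitutes:
            if new == original:
                continue
            label = f"{original}{start + offset}{new}"
            sequence = peptide[:offset] + new + peptide[offset + 1:]
            mutants.append((label, sequence))
    return mutants


if __name__ == "__main__":
    # Residues 366-374 as labelled in the abstract: C Y G V S A T K L.
    for label, sequence in single_residue_mutants("CYGVSATKL", 366):
        print(label, sequence)
```

    Each resulting sequence would then be passed to a binding predictor; that step is left out because it depends on an external tool whose interface is not described here.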

  14. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  15. Bioinformatics of cardiovascular miRNA biology.

    Science.gov (United States)

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Bioinformatics tools and database resources for systems genetics analysis in mice-a short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, Caroline; Swertz, Morris A.; Alberts, Rudi; Arends, Danny; Moeller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K. Joeri; Jansen, Ritsert C.; Schughart, Klaus

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  17. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    Directory of Open Access Journals (Sweden)

    Tilton Susan C

    2012-11-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 uM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs. Conclusions BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single

  18. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    Science.gov (United States)

    2012-01-01

    Background MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 uM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs. Conclusions BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis

  19. Bioinformatic analysis of the nucleolus

    DEFF Research Database (Denmark)

    Leung, Anthony K L; Andersen, Jens S; Mann, Matthias

    2003-01-01

    The nucleolus is a plurifunctional, nuclear organelle, which is responsible for ribosome biogenesis and many other functions in eukaryotes, including RNA processing, viral replication and tumour suppression. Our knowledge of the human nucleolar proteome has been expanded dramatically by the two r...

  20. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Chepyshko Hanna

    2012-07-01

    Full Text Available Abstract Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of the identified putative conserved clade-common and clade-specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were

  1. Short-term arginine deprivation results in large-scale modulation of hepatic gene expression in both normal and tumor cells: microarray bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Sabo Edmond

    2006-09-01

    Full Text Available Abstract Background We have reported arginine-sensitive regulation of LAT1 amino acid transporter (SLC 7A5) in normal rodent hepatic cells with loss of arginine sensitivity and high level constitutive expression in tumor cells. We hypothesized that liver cell gene expression is highly sensitive to alterations in the amino acid microenvironment and that tumor cells may differ substantially in gene sets sensitive to amino acid availability. To assess the potential number and classes of hepatic genes sensitive to arginine availability at the RNA level and compare these between normal and tumor cells, we used an Affymetrix microarray approach, a paired in vitro model of normal rat hepatic cells and a tumorigenic derivative with triplicate independent replicates. Cells were exposed to arginine-deficient or control conditions for 18 hours in medium formulated to maintain differentiated function. Results Initial two-way analysis with a p-value of 0.05 identified 1419 genes in normal cells versus 2175 in tumor cells whose expression was altered in arginine-deficient conditions relative to controls, representing 9–14% of the rat genome. More stringent bioinformatic analysis with 9-way comparisons and a minimum of 2-fold variation narrowed this set to 56 arginine-responsive genes in normal liver cells and 162 in tumor cells. Approximately half the arginine-responsive genes in normal cells overlap with those in tumor cells. Of these, the majority was increased in expression and included multiple growth, survival, and stress-related genes. GADD45, TA1/LAT1, and caspases 11 and 12 were among this group. Previously known amino acid regulated genes were among the pool in both cell types. Available cDNA probes allowed independent validation of microarray data for multiple genes. Among genes downregulated under arginine-deficient conditions were multiple genes involved in cholesterol and fatty acid metabolism. Expression of low-density lipoprotein receptor was
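
    The two filtering criteria named in the abstract, a p-value cut-off of 0.05 and a minimum 2-fold change, can be combined as in the following minimal sketch. The gene names, expression values, and function name are hypothetical, and the 9-way replicate comparison used by the authors is not modelled here.

```python
def responsive_genes(records, alpha=0.05, min_fold=2.0):
    """Keep genes with p < alpha and at least min_fold change in either direction.

    Each record is (gene, control_mean, treated_mean, p_value); means must be > 0.
    """
    hits = []
    for gene, control, treated, p_value in records:
        fold = treated / control
        if p_value < alpha and (fold >= min_fold or fold <= 1.0 / min_fold):
            hits.append((gene, "up" if fold > 1.0 else "down"))
    return hits


if __name__ == "__main__":
    # Hypothetical expression means under control and arginine-deficient conditions.
    records = [
        ("gene_a", 100.0, 250.0, 0.01),  # 2.5-fold up, significant
        ("gene_b", 100.0, 150.0, 0.01),  # significant but below 2-fold
        ("gene_c", 100.0, 40.0, 0.03),   # 2.5-fold down, significant
        ("gene_d", 100.0, 300.0, 0.20),  # large change but not significant
    ]
    print(responsive_genes(records))
```

    Requiring both criteria is what shrinks a long p-value-only list to a much shorter set, as in the reduction from thousands of genes to 56 and 162 reported above.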

  2. Comprehensive spectral analysis of Cyg X-1 using RXTE data

    International Nuclear Information System (INIS)

    Shahid, Rizwan; Jaaffrey, S. N. A.; Misra, Ranjeev

    2012-01-01

    spectra with Γ < 1.6, despite a large number having Γ ∼ 1.65. This comprehensive analysis lays the framework by which more detailed and sophisticated broadband observations may be understood. (research papers)

  3. Comprehensive proteomic analysis of the wheat pathogenic fungus Zymoseptoria tritici

    DEFF Research Database (Denmark)

    Yang, Fen; Yin, Qi

    2016-01-01

    Zymoseptoria tritici causes Septoria tritici blotch disease of wheat. To obtain a comprehensive protein dataset of this fungal pathogen, proteomes of Z. tritici growing in nutrient-limiting and rich media and in vivo at a late stage of wheat infection were fractionated by 1D gel or strong cation...

  4. Advanced AEM by Comprehensive Analysis and Modeling of System Drift

    Science.gov (United States)

    Schiller, Arnulf; Klune, Klaus; Schattauer, Ingrid

    2010-05-01

    The quality of the assessment of risks outgoing from environmental hazards strongly depends on the spatial and temporal distribution of the data collected in a survey area. Natural hazards generally emerge from wide areas as it is in the case of volcanoes or land slides. Conventional surface measurements are restricted to few lines or locations and often can't be conducted in difficult terrain. So they only give a spatial and temporary limited data set and therefore limit the reliability of risk analysis. Aero-geophysical measurements potentially provide a valuable tool for completing the data set as they can be performed over a wide area, even above difficult terrain within a short time. A most desirable opportunity in course of such measurements is the ascertainment of the dynamics of such potentially hazardous environmental processes. This necessitates repeated and reproducible measurements. Current HEM systems can't accomplish this adequately due to their system immanent drift and - in some cases - bad signal to noise ratio. So, to develop comprising concepts for advancing state of the art HEM-systems to a valuable tool for data acquisition in risk assessment or hydrological problems, different studies have been undertaken which form the contents of the presented work conducted in course of the project HIRISK (Helicopter Based Electromagnetic System for Advanced Environmental Risk Assessment - FWF L-354 N10, supported by the Austrian Science Fund). The methodology is based upon two paths: A - Comprehensive experimental testing on an existing HEM system serving as an experimental platform. B - The setup of a numerical model which is continuously refined according to the results of the experimental data. The model then serves to simulate the experimental as well as alternative configurations and to analyze them subject to their drift behavior. Finally, concepts for minimizing the drift are derived and tested. 
Different test series - stationary on ground as well

  5. Basic complex analysis a comprehensive course in analysis, part 2a

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 2A is devoted to basic complex analysis. It interweaves three analytic threads associated with Cauchy, Riemann, and Weierstrass, respectively. Cauchy's view focuses on th

  6. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics.

    Science.gov (United States)

    Wang, Chen; Han, Jian; Liu, Chonghuai; Kibet, Korir Nicholas; Kayesh, Emrul; Shangguan, Lingfei; Li, Xiaoying; Fang, Jinggui

    2012-03-29

    MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. 
Deep sequencing of short RNAs from Amur grape flowers and berries identified 72 new potential miRNAs and 34 known but non-conserved mi

  7. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Full Text Available Abstract Background MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. Results A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism.
Conclusions Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  8. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  9. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  10. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  11. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Full Text Available Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework

  12. The growing need for microservices in bioinformatics.

    Science.gov (United States)

    Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J

    2016-01-01

    Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on nimble IT framework which can adapt to changing requirements. To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and

  13. The growing need for microservices in bioinformatics

    Science.gov (United States)

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework is an effective

  14. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  15. Revealing and Quantifying the Impaired Phonological Analysis Underpinning Impaired Comprehension in Wernicke's Aphasia

    Science.gov (United States)

    Robson, Holly; Keidel, James L.; Lambon Ralph, Matthew A.; Sage, Karen

    2012-01-01

    Wernicke's aphasia is a condition which results in severely disrupted language comprehension following a lesion to the left temporo-parietal region. A phonological analysis deficit has traditionally been held to be at the root of the comprehension impairment in Wernicke's aphasia, a view consistent with current functional neuroimaging which finds…

  16. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  17. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803

    Directory of Open Access Journals (Sweden)

    Pistorius Elfriede K

    2007-11-01

    Full Text Available Abstract Background So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis. Results We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway or L-arginine deiminase pathway or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but cannot be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i) an L-arginine decarboxylase pathway, (ii) an L-arginine deiminase pathway, and (iii) an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 μmol photons m⁻² s⁻¹ showed that the transcripts for the first enzyme(s) of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than that of the three isoenzymes of L-arginine decarboxylase. Conclusion The evaluation of 24

  18. Peer Mentoring for Bioinformatics presentation

    OpenAIRE

    Budd, Aidan

    2014-01-01

    A handout used in a HUB (Heidelberg Unseminars in Bioinformatics) meeting focused on career development for bioinformaticians. It describes an activity that helps introduce the idea of peer mentoring, potentially acting as an opportunity to create peer-mentoring groups.

  19. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  20. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  1. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs), thus greatly increasing the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy.
Therefore our work
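
The metamorphic-testing idea in the record above can be illustrated with a minimal Python sketch. Everything here is hypothetical: `map_reads` is a toy exact-match mapper written only for illustration (it is not SeqMap or GNLab), and the two metamorphic relations are plausible examples, not necessarily the ones the authors used. The point is that each check compares a pair of outputs instead of requiring a known-correct answer.

```python
import random


def map_reads(reference, reads):
    """Toy exact-match mapper (hypothetical): read -> list of start positions."""
    hits = {}
    for read in reads:
        positions = []
        start = reference.find(read)
        while start != -1:
            positions.append(start)
            start = reference.find(read, start + 1)
        hits[read] = positions
    return hits


def mr_permutation(reference, reads, seed=0):
    """MR 1 (assumed): shuffling the input reads must not change any read's hits."""
    shuffled = reads[:]
    random.Random(seed).shuffle(shuffled)
    return map_reads(reference, reads) == map_reads(reference, shuffled)


def mr_extension(reference, reads, suffix="ACGT"):
    """MR 2 (assumed): appending bases to the reference must keep every original hit."""
    before = map_reads(reference, reads)
    after = map_reads(reference + suffix, reads)
    return all(set(before[r]) <= set(after[r]) for r in reads)


if __name__ == "__main__":
    ref = "ACGTACGTTTGA"
    reads = ["ACG", "TTG", "GGG"]
    print(map_reads(ref, reads))
    print(mr_permutation(ref, reads), mr_extension(ref, reads))
```

A violated relation signals a fault even when nobody knows the correct mapping for a given input, which is the situation the abstract describes for complex bioinformatics programs.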

  2. Statistical Analysis of a Comprehensive List of Visual Binaries

    Directory of Open Access Journals (Sweden)

    Kovaleva D.

    2015-12-01

    Full Text Available Visual binary stars are the most abundant class of observed binaries. The most comprehensive list of data on visual binaries compiled recently by cross-matching the largest catalogues of visual binaries allowed a statistical investigation of observational parameters of these systems. The dataset was cleaned by correcting uncertainties and misclassifications, and supplemented with available parallax data. The refined dataset is free from technical biases and contains 3676 presumably physical visual pairs of luminosity class V with known angular separations, magnitudes of the components, spectral types, and parallaxes. We also compiled a restricted sample of 998 pairs free from observational biases due to the probability of binary discovery. Certain distributions of observational and physical parameters of stars of our dataset are discussed.

  3. CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

    Science.gov (United States)

    Lee, Mikyung; Kim, Yangseok

    2009-12-16

    Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, there are many limitations in their performance of comprehensive integrated analysis using published software because of limitations in implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold based method or SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. 
In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square
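
As a rough illustration of the first part of the integrative analysis described above, the sketch below computes Pearson's and Spearman's correlation between two paired lists in plain Python. It is not CHESS code (CHESS is a Java program); the function names and the sample numbers are made up for the example, and the inputs are assumed to be per-gene comparative genomic hybridization ratios paired with expression levels.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson's correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical values: CGH ratio and expression level for five genes.
    cgh_ratio = [1.0, 2.0, 3.0, 4.0, 5.0]
    expression = [1.0, 4.0, 9.0, 16.0, 25.0]
    print(pearson(cgh_ratio, expression), spearman(cgh_ratio, expression))
```

Spearman's coefficient depends only on rank order, so it is less sensitive than Pearson's to a monotonic but non-linear relationship between copy number and expression.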

  4. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  5. Bioinformatics decoding the genome

    CERN Multimedia

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA From Genome to Sequence : Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  6. Big data bioinformatics.

    Science.gov (United States)

    Greene, Casey S; Tan, Jie; Ung, Matthew; Moore, Jason H; Cheng, Chao

    2014-12-01

    Recent technological advances allow for high throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is leading us to the "big data" era. The availability of big data provides unprecedented opportunities but also raises new challenges for data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including both "machine learning" algorithms as well as "unsupervised" and "supervised" examples of each. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming based solutions, we review webservers that allow users with limited or no programming background to perform these analyses on large data compendia. © 2014 Wiley Periodicals, Inc.

  7. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  8. Swabs to genomes: a comprehensive workflow

    Directory of Open Access Journals (Sweden)

    Madison I. Dunitz

    2015-05-01

    Full Text Available The sequencing, assembly, and basic analysis of microbial genomes, once a painstaking and expensive undertaking, has become much easier for research labs with access to standard molecular biology and computational tools. However, there is a confusing variety of options available for DNA library preparation and sequencing, and inexperience with bioinformatics can pose a significant barrier to entry for many who may be interested in microbial genomics. The objective of the present study was to design, test, troubleshoot, and publish a simple, comprehensive workflow from the collection of an environmental sample (a swab) to a published microbial genome, empowering even a lab or classroom with limited resources and bioinformatics experience to perform it.

  9. Identification Of Protein Vaccine Candidates Using Comprehensive Proteomic Analysis Strategies

    National Research Council Canada - National Science Library

    Rohrbough, James G

    2007-01-01

    Presented in this dissertation are proteomic analysis studies focused on identifying proteins to be used as vaccine candidates against Coccidioidomycosis, a potentially fatal human pulmonary disease...

  10. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  11. Modern bioinformatics meets traditional Chinese medicine.

    Science.gov (United States)

    Gu, Peiqin; Chen, Huajun

    2014-11-01

    Traditional Chinese medicine (TCM) is gaining increasing attention with the emergence of integrative medicine and personalized medicine, characterized by pattern differentiation on individual variance and treatments based on natural herbal synergism. Investigating the effectiveness and safety of the potential mechanisms of TCM and the combination principles of drug therapies will bridge the cultural gap with Western medicine and improve the development of integrative medicine. Dealing with rapidly growing amounts of biomedical data and their heterogeneous nature are two important tasks among modern biomedical communities. Bioinformatics, as an emerging interdisciplinary field of computer science and biology, has become a useful tool for easing the data deluge pressure by automating the computation processes with informatics methods. Using these methods to retrieve, store and analyze the biomedical data can effectively reveal the associated knowledge hidden in the data, and thus promote the discovery of integrated information. Recently, these techniques of bioinformatics have been used for facilitating the interactional effects of both Western medicine and TCM. The analysis of TCM data using computational technologies provides biological evidence for the basic understanding of TCM mechanisms, safety and efficacy of TCM treatments. At the same time, the carrier and targets associated with TCM remedies can inspire the rethinking of modern drug development. This review summarizes the significant achievements of applying bioinformatics techniques to many aspects of the research in TCM, such as analysis of TCM-related '-omics' data and techniques for analyzing biological processes and pharmaceutical mechanisms of TCM, which have shown certain potential of bringing new thoughts to both sides. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  12. Comprehensive Case Analysis on Participatory Approaches, from Nexus Perspectives

    Science.gov (United States)

    Masuhara, N.; Baba, K.

    2014-12-01

    According to Messages from the Bonn2011 Conference, involving local communities fully and effectively in the planning and implementation processes related to water, energy and food nexus for local ownership and commitment is strongly needed. Participatory approaches such as deliberative polling, "joint fact-finding" and so on have been applied so far to resolve various environmental disputes; however, the drivers and barriers in such processes have not necessarily been analyzed sufficiently in a comprehensive manner, especially in Japan. Our research aims to explore solutions for conflicts in the context of water-energy-food nexus in local communities. To achieve it, we clarify drivers and barriers of each approach applied so far in water, energy and food policy, focusing on how to deal with scientific facts. Our primary hypothesis is that multi-issue solutions through policy integration will be more effective for conflicts in the context of water-energy-food nexus than single-issue solutions for each policy. One of the key factors in formulating effective solutions is to integrate "scientific fact (expert knowledge)" and "local knowledge". Given this primary hypothesis, more specifically, we assume that it is effective for building consensus to provide opportunities to resolve the disagreement of "framing", so that stakeholders can offer experts the points for providing scientific facts and experts can reach a common understanding of scientific facts in the early stage of the process. To verify the hypotheses, we develop a database of cases in which such participatory approaches have been applied so far to resolve various environmental disputes, based on a literature survey of journal articles and public documents of Japanese cases. At present, our database is under construction, but we estimate that conditions of framing and providing scientific information are important driving factors for problem solving and consensus building.
And it's important to refine

  13. A Comprehensive Classification and Evolutionary Analysis of Plant Homeobox Genes

    OpenAIRE

    Mukherjee, Krishanu; Brocchieri, Luciano; Bürglin, Thomas R.

    2009-01-01

    The full complement of homeobox transcription factor sequences, including genes and pseudogenes, was determined from the analysis of 10 complete genomes from flowering plants, moss, Selaginella, unicellular green algae, and red algae. Our exhaustive genome-wide searches resulted in the discovery in each class of a greater number of homeobox genes than previously reported. All homeobox genes can be unambiguously classified by sequence evolutionary analysis into 14 distinct classes also charact...

  14. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  15. Bioinformatics programs are 31-fold over-represented among the highest impact scientific papers of the past two decades.

    Science.gov (United States)

    Wren, Jonathan D

    2016-09-01

    To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P papers, bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
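
    The enrichment figure in this abstract is a ratio of two shares, which a short calculation can make concrete. The sketch below assumes that "fold enrichment" means the share of bioinformatics papers among the most cited papers divided by their share among all published papers; the abstract does not state the baseline share, so the value printed here is back-computed from the two reported numbers and is only an implied approximation. The function names are illustrative.

```python
def fold_enrichment(observed_share: float, baseline_share: float) -> float:
    """Ratio of the share in the top-cited set to the share in the full literature."""
    if not 0 < baseline_share <= 1:
        raise ValueError("baseline_share must be in (0, 1]")
    return observed_share / baseline_share


def implied_baseline(observed_share: float, fold: float) -> float:
    """Baseline share implied by an observed share and a reported fold enrichment."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return observed_share / fold


# Values reported in the abstract: 34% of top-cited papers, about 31-fold enrichment.
top_cited_share = 0.34
reported_fold = 31.0

# Assumption: back-computed, not a figure given in the abstract.
baseline = implied_baseline(top_cited_share, reported_fold)
print(f"implied share of bioinformatics papers overall: {baseline:.2%}")
```

    Under this reading, bioinformatics papers would make up roughly 1% of all papers published over the period.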

  16. Instructional Systems Development: Conceptual Analysis and Comprehensive Bibliography

    Science.gov (United States)

    1976-02-01

    M. P., Daily, J. T. An analysis of elementary pilot performance from instructors’ comments. Amer. Psychol., 1946. 1, 292. Creelman , J. A. Evaluation...of approach training procedures. Report No. 2, Project No. NM 001-I09-T7, U. S. Naval School of Aviation Medicine, Pensacola, Florida, 1955. Creelman

  17. Bioinformatics Methods for Interpreting Toxicogenomics Data: The Role of Text-Mining

    NARCIS (Netherlands)

    Hettne, K.M.; Kleinjans, J.; Stierum, R.H.; Boorsma, A.; Kors, J.A.

    2014-01-01

    This chapter concerns the application of bioinformatics methods to the analysis of toxicogenomics data. The chapter starts with an introduction covering how bioinformatics has been applied in toxicogenomics data analysis, and continues with a description of the foundations of a specific

  18. Computed microtomography and X-ray fluorescence analysis for comprehensive analysis of structural changes in bone.

    Science.gov (United States)

    Buzmakov, Alexey; Chukalina, Marina; Nikolaev, Dmitry; Schaefer, Gerald; Gulimova, Victoria; Saveliev, Sergey; Tereschenko, Elena; Seregin, Alexey; Senin, Roman; Prun, Victor; Zolotov, Denis; Asadchikov, Victor

    2013-01-01

    This paper presents the results of a comprehensive analysis of structural changes in the caudal vertebrae of Turner's thick-toed geckos by computer microtomography and X-ray fluorescence analysis. We present the algorithms used for the reconstruction of tomographic images, which make it possible to work with high-noise projections of the kind that are typical given the nature of the samples. Reptiles, owing to their ruggedness, small size, membership of the amniotes and a number of other valuable features, are an attractive model organism for long orbital experiments on unmanned spacecraft. Possible changes in their bone tissue under the influence of spaceflight are the subject of discussion between biologists from different laboratories around the world.

  19. Compare the Difference of B-cell Epitopes of EgAgB1 and EgAgB3 Proteins Selected through Bioinformatic Analysis

    Science.gov (United States)

    An, Mengting; Zhang, Fengbo; Zhu, Yuejie; Zhao, Xiao; Ding, Jianbing

    2018-01-01

    Cystic echinococcosis, a zoonosis, seriously endangers humans and animals, so early diagnosis of this disease is particularly important. This study therefore predicts B-cell epitopes of the EgAgB1 and EgAgB3 proteins using bioinformatics software. B-cell epitopes of the EgAgB1 and EgAgB3 proteins are predicted using DNAStar and IEDB software. The results suggest that there are two potential B-cell epitopes in EgAgB1, located in the 8-15 and 31-37 amino acid residue segments, and two potential B-cell epitopes in EgAgB2, located in the 20-27 and 47-53 amino acid residue segments. This study predicted the B-cell epitopes of the EgAgB1 and EgAgB3 proteins, laying the foundation for the early diagnosis of cystic echinococcosis.

  20. A Comprehensive Analysis of Authorship in Radiology Journals.

    Science.gov (United States)

    Dang, Wilfred; McInnes, Matthew D F; Kielar, Ania Z; Hong, Jiho

    2015-01-01

    The purpose of our study was to investigate authorship trends in radiology journals, and whether International Committee of Medical Journal Editors (ICMJE) recommendations have had an impact on these trends. A secondary objective was to explore other variables associated with authorship trends. A retrospective, bibliometric analysis of 49 clinical radiology journals published from 1946-2013 was conducted. The following data were exported from MEDLINE (1946 to May 2014) for each article: authors' full name, year of publication, primary author institution information, language of publication and publication type. Microsoft Excel Visual Basic for Applications scripts were programmed to categorize extracted data. Statistical analysis was performed to determine the overall mean number of authors per article over time, impact of ICMJE guidelines, authorship frequency per journal, country of origin, article type and language of publication. 216,271 articles from 1946-2013 were included. A univariate analysis of the mean authorship frequency per year of all articles yielded a linear relationship between time and authorship frequency. The mean number of authors per article in 1946 (1.42) was found to have increased consistently by 0.07 authors/article per year (R² = 0.9728, Pjournals, country of origin, language of publication and article type. Overall authorship for 49 radiology journals across 68 years has increased markedly with no demonstrated impact from ICMJE guidelines. A higher number of authors per article was seen in articles from: higher impact journals, European and Asian countries, original research type, and journals that explicitly endorse the ICMJE guidelines.
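
    The linear relationship reported in this abstract can be written out as a small function. This is a minimal sketch that assumes the 1946 mean (1.42 authors per article) can serve as the intercept and 0.07 authors per article per year as the slope; the abstract does not give the fitted intercept, so the values this produces are rough illustrations, not figures from the study.

```python
BASE_YEAR = 1946
BASE_MEAN_AUTHORS = 1.42   # mean authors per article in 1946, from the abstract
SLOPE_PER_YEAR = 0.07      # reported yearly increase in authors per article


def predicted_mean_authors(year: int) -> float:
    """Mean authors per article implied by the reported linear trend.

    Assumption: the 1946 value is used as the intercept of the line.
    """
    return BASE_MEAN_AUTHORS + SLOPE_PER_YEAR * (year - BASE_YEAR)


for year in (1946, 1980, 2013):
    print(year, round(predicted_mean_authors(year), 2))
```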

  1. Bioinformatic Analysis of Deleterious Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) in the Coding Regions of Human Prion Protein Gene (PRNP)

    Directory of Open Access Journals (Sweden)

    Kourosh Bamdad

    2016-12-01

    Full Text Available Background & Objective: Single nucleotide polymorphisms are a cause of genetic variation in living organisms. Single nucleotide polymorphisms can alter residues in the protein sequence. In this investigation, the relationship between prion protein gene polymorphisms and their relevance to pathogenicity was studied. Material & Method: The amino acid sequence of the main isoform of the human prion protein gene (PRNP) was extracted from the UniProt database and evaluated by the FoldAmyloid and AmylPred servers. All non-synonymous single nucleotide polymorphisms (nsSNPs) from the SNP database (dbSNP) were further analyzed by bioinformatics servers including SIFT, PolyPhen-2, I-Mutant-3.0, PANTHER, SNPs & GO, PHD-SNP, Meta-SNP, and MutPred to determine the most damaging nsSNPs. Results: The results of the first structure analyses by the FoldAmyloid and AmylPred servers implied that regions including 5-15, 174-178, 180-184, 211-217, and 240-252 were the parts of the protein sequence most sensitive to amyloidosis. Screening all nsSNPs of the main protein isoform using bioinformatic servers revealed that substitution of aspartic acid with valine at position 178 (ID code: rs11538766) was the most deleterious nsSNP in the protein structure. Conclusion: Substitution of aspartic acid with valine at position 178 (D178V) was the most pathogenic mutation in the human prion protein gene. Analyses from the MutPred server also showed that an increase in beta-sheets in the secondary structure was the main reason behind the molecular mechanism of prion protein aggregation.
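
    Substitution codes such as D178V follow a simple convention: reference residue, 1-based position, replacement residue. The sketch below shows how such a code could be parsed and applied to a sequence before it is submitted to a prediction server. The helper names are hypothetical, and the sequence used is a toy placeholder, not the real PRNP protein sequence.

```python
import re


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a one-letter missense code such as 'D178V' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a one-letter substitution code: {code!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def apply_substitution(sequence: str, code: str) -> str:
    """Return the sequence with the substitution applied (positions are 1-based)."""
    ref, pos, alt = parse_substitution(code)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[: pos - 1] + alt + sequence[pos:]


# Toy placeholder sequence with an aspartic acid (D) at position 178.
toy_sequence = "A" * 177 + "D" + "A" * 10
mutated = apply_substitution(toy_sequence, "D178V")
print(mutated[177])
```

    Checking the reference residue before substituting guards against off-by-one errors and against applying a variant to the wrong isoform.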

  2. Peripheral blood smear image analysis: A comprehensive review

    Directory of Open Access Journals (Sweden)

    Emad A Mohammed

    2014-01-01

    Full Text Available Peripheral blood smear image examination is a part of the routine work of every laboratory. The manual examination of these images is tedious, time-consuming and suffers from interobserver variation. This has motivated researchers to develop different algorithms and methods to automate peripheral blood smear image analysis. Image analysis itself consists of a sequence of steps: image segmentation, feature extraction and selection, and pattern classification. The image segmentation step addresses the problem of extracting the object or region of interest from the complicated peripheral blood smear image. Support vector machines (SVM) and artificial neural networks (ANNs) are two common approaches to image segmentation. Feature extraction and selection aims to derive descriptive characteristics of the extracted object, which are similar within the same object class and different between different objects. This facilitates the last step of the image analysis process: pattern classification. The goal of pattern classification is to assign a class to the selected features from a group of known classes. There are two types of classifier learning algorithms: supervised and unsupervised. Supervised learning algorithms predict the class of the object under test using training data of known classes. The training data have a predefined label for every class and the learning algorithm can utilize this data to predict the class of a test object. Unsupervised learning algorithms use unlabeled training data and divide them into groups using similarity measurements. Unsupervised learning algorithms predict the group to which a new test object belongs, based on the training data, without giving an explicit class to that object. ANN, SVM, decision tree and K-nearest neighbor are possible approaches to classification. Increased discrimination may be obtained by combining several classifiers.
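
    The supervised classification step and the idea of combining classifiers can be sketched in a few lines. The example below is a minimal K-nearest-neighbor classifier plus a majority vote over several predictions, written with the standard library only. The feature vectors and class labels are made-up placeholders standing in for features extracted from segmented cells; nothing here is taken from the review itself.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` from the k nearest labelled training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def majority_vote(predictions):
    """Combine the outputs of several classifiers by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]


# Placeholder training data: two hypothetical features per object.
training = [
    ((1.0, 1.0), "class_a"), ((1.2, 0.9), "class_a"), ((0.8, 1.1), "class_a"),
    ((5.0, 5.0), "class_b"), ((5.2, 4.8), "class_b"), ((4.9, 5.1), "class_b"),
]

print(knn_predict(training, (1.1, 1.0)))
print(majority_vote(["class_a", "class_b", "class_a"]))
```

    In practice each vote would come from a different trained model (for example an SVM, a decision tree and a K-nearest-neighbor classifier) rather than from a hand-written list.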

  3. Integrated care: a comprehensive bibliometric analysis and literature review

    Directory of Open Access Journals (Sweden)

    Xiaowei Sun

    2014-06-01

    Full Text Available Introduction: Integrated care could not only repair fragmented health care but also improve the continuity of care and the quality of life. Despite the volume and variety of publications, little is known about how 'integrated care' has developed. There is a need for a systematic bibliometric analysis of the important features of the integrated care literature. Aim: To investigate the growth pattern, core journals and jurisdictions and identify the key research domains of integrated care. Methods: We searched Medline/PubMed using the search strategy '(delivery of health care, integrated [MeSH Terms]) OR integrated care [Title/Abstract]' without time and language limits. Second, we extracted the publishing year, journals, jurisdictions and keywords of the retrieved articles. Finally, descriptive statistical analysis by the Bibliographic Item Co-occurrence Matrix Builder and hierarchical clustering by SPSS were used. Results: As many as 9090 articles were retrieved. Results included: (1) the cumulative number of publications on integrated care rose steeply after 1993; (2) the documents were published in 1646 different journals, of which 28 were core journals; (3) the USA is the predominant publishing country; and (4) there are six key domains: the definition/models of integrated care, interdisciplinary patient care team, disease management for chronically ill patients, types of health care organizations and policy, information system integration and legislation/jurisprudence. Discussion and conclusion: Integrated care literature has been most evident in developed countries. International Journal of Integrated Care is highly recommended in this research area. The bibliometric analysis and identification of publication hotspots provides researchers and practitioners with core target journals, as well as an overview of the field for further research in integrated care.
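
    The keyword co-occurrence step that precedes clustering in a bibliometric analysis of this kind can be illustrated with a short sketch. It counts how often each pair of keywords appears on the same article. This is a generic standard-library illustration, not the Bibliographic Item Co-occurrence Matrix Builder or the SPSS procedure used in the study, and the keyword lists are placeholders.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for every unordered keyword pair, the number of records containing both."""
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


# Placeholder records: each inner list holds the keywords of one article.
records = [
    ["integrated care", "disease management"],
    ["integrated care", "disease management", "policy"],
    ["policy", "integrated care"],
]

for pair, n in cooccurrence_counts(records).most_common():
    print(pair, n)
```

    The resulting pair counts form the symmetric matrix that a hierarchical clustering routine would take as input.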

  4. Emotions while awaiting lung transplantation: A comprehensive qualitative analysis

    Science.gov (United States)

    Brügger, Aurelia; Aubert, John-David

    2014-01-01

    Patients awaiting lung transplantation are at risk of negative emotional and physical experiences. How do they talk about emotions? Semi-structured interviews were performed (15 patients). Categorical analysis focusing on emotion-related descriptions was organized into positive–negative–neutral descriptions: for primary and secondary emotions, evaluation processes, coping strategies, personal characteristics, emotion descriptions associated with physical states, (and) contexts were listed. Patients develop different strategies to maintain positive identity and attitude, while preserving significant others from extra emotional load. Results are discussed within various theoretical and research backgrounds, in emphasizing their importance in the definition of emotional support starting from the patient’s perspective. PMID:28070345

  5. Comprehensive safeguards evaluation methods and societal risk analysis

    International Nuclear Information System (INIS)

    Richardson, J.M.

    1982-03-01

    Essential capabilities of an integrated evaluation methodology for analyzing safeguards systems are discussed. Such a methodology must be conceptually meaningful, technically defensible, discriminating and consistent. A decomposition of safeguards systems by function is mentioned as a possible starting point for methodology development. The application of a societal risk equation to safeguards systems analysis is addressed. Conceptual problems with this approach are discussed. Technical difficulties in applying this equation to safeguards systems are illustrated through the use of confidence intervals, information content, hypothesis testing and ranking and selection procedures.

  6. Emotions while awaiting lung transplantation: A comprehensive qualitative analysis.

    Science.gov (United States)

    Brügger, Aurelia; Aubert, John-David; Piot-Ziegler, Chantal

    2014-07-01

    Patients awaiting lung transplantation are at risk of negative emotional and physical experiences. How do they talk about emotions? Semi-structured interviews were performed (15 patients). Categorical analysis focusing on emotion-related descriptions was organized into positive-negative-neutral descriptions: for primary and secondary emotions, evaluation processes, coping strategies, personal characteristics, emotion descriptions associated with physical states, (and) contexts were listed. Patients develop different strategies to maintain positive identity and attitude, while preserving significant others from extra emotional load. Results are discussed within various theoretical and research backgrounds, in emphasizing their importance in the definition of emotional support starting from the patient's perspective.

  7. Emotions while awaiting lung transplantation: A comprehensive qualitative analysis

    Directory of Open Access Journals (Sweden)

    Aurelia Brügger

    2014-12-01

    Full Text Available Patients awaiting lung transplantation are at risk of negative emotional and physical experiences. How do they talk about emotions? Semi-structured interviews were performed (15 patients). Categorical analysis focusing on emotion-related descriptions was organized into positive–negative–neutral descriptions: for primary and secondary emotions, evaluation processes, coping strategies, personal characteristics, emotion descriptions associated with physical states, (and) contexts were listed. Patients develop different strategies to maintain positive identity and attitude, while preserving significant others from extra emotional load. Results are discussed within various theoretical and research backgrounds, in emphasizing their importance in the definition of emotional support starting from the patient’s perspective.

  8. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    NARCIS (Netherlands)

    L. Shen (Lishuang); M.A. Diroma (Maria Angela); M. Gonzalez (Michael); D. Navarro-Gomez (Daniel); J. Leipzig (Jeremy); M.T. Lott (Marie T.); M. van Oven (Mannis); D.C. Wallace; C.C. Muraresku (Colleen Clarke); Z. Zolkipli-Cunningham (Zarazuela); P.F. Chinnery (Patrick); M. Attimonelli (Marcella); S. Zuchner (Stephan); M.J. Falk (Marni J.); X. Gai (Xiaowu)

    2016-01-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes,

  9. A comprehensive fuel nuclide analysis at the reprocessing plant

    International Nuclear Information System (INIS)

    Arenz, H.J.; Koch, L.

    1983-01-01

    The composition of spent fuel can be determined by various methods, which rely in part on different information. A synopsis of the results of all methods therefore permits systematic errors to be detected and explained. Methods for determining the masses of fuel nuclides at the reprocessing input point range from pure calculations (shipper data) to purely experimental determinations (volumetric analysis). In between, a mix of "fresh" experimental results and "historical" data is used to establish a material balance. Deviations in the results obtained by the individual methods can be attributed to the information source, which is unique for the method in question. The methodology of the approach consists of three steps: by paired comparison of the operator analysis (usually volumetric or gravimetric) with remeasurements the error components are determined on a batch-by-batch basis. Using the isotope correlation technique the operator data as well as the remeasurements are checked on an inter-batch basis for outliers, precision and bias. Systematic errors can be uncovered by inter-lab comparison of remeasurements and confirmed by using historical information. Experience collected during the reprocessing of LWR fuel at two reprocessing plants proves the flexibility and effectiveness of this approach. An example is presented to demonstrate its capability in detecting outliers and determining systematic errors. (author)
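
    The paired comparison of operator values with remeasurements can be sketched as a simple difference analysis. The example below assumes the simplest possible reading: the mean of the paired differences estimates a systematic offset between the two sources and their standard deviation estimates the random scatter. The abstract does not describe the actual error model, and the batch values are invented placeholders used only to make the code runnable.

```python
from statistics import mean, stdev


def paired_difference_summary(operator_values, remeasured_values):
    """Return (mean difference, standard deviation of differences) for paired batches."""
    if len(operator_values) != len(remeasured_values):
        raise ValueError("both series must cover the same batches")
    if len(operator_values) < 2:
        raise ValueError("at least two batches are needed")
    diffs = [o - r for o, r in zip(operator_values, remeasured_values)]
    return mean(diffs), stdev(diffs)


# Invented placeholder values for three batches (arbitrary mass units).
operator = [100.0, 102.0, 98.0]
remeasured = [99.0, 101.0, 97.0]

bias, scatter = paired_difference_summary(operator, remeasured)
print(f"estimated offset: {bias:.2f}, scatter: {scatter:.2f}")
```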

  10. G-DOC Plus - an integrative bioinformatics platform for precision medicine.

    Science.gov (United States)

    Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha

    2016-04-30

    G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. 
G-DOC Plus is available

  11. COMPREHENSIVE ANALYSIS OF PREBIOTIC PROPENAL UP TO 660 GHz

    Energy Technology Data Exchange (ETDEWEB)

    Daly, A. M.; Bermúdez, C.; Kolesniková, L.; Alonso, J. L., E-mail: Adam.M.Daly@jpl.nasa.gov [Grupo de Espectroscopia Molecular (GEM), Edificio Quifima, Área de Química-Física, Laboratorios de Espectroscopia y Bioespectroscopia, Parque Científico UVa, Unidad Asociada CSIC, Universidad de Valladolid, E-47011 Valladolid (Spain)

    2015-06-22

    Since interstellar detection of propenal is only based on two rotational transitions in the centimeter wave region, its high resolution rotational spectrum has been measured up to 660 GHz and fully characterized by assignment of more than 12,000 transitions to provide direct laboratory data to the astronomical community. Spectral assignments and analysis include transitions from the ground state of the trans and cis isomers, three trans-¹³C isotopologues, and ten excited vibrational states of the trans form. Combining new millimeter and submillimeter data with those from the far-infrared region has yielded the most precise set of spectroscopic constants of trans-propenal obtained to date. Newly determined rotational constants, centrifugal distortion constants, vibrational energies, and Coriolis and Fermi interaction constants are given with high accuracy and were used to predict transition frequencies and intensities over a wide frequency range. Results of this work should facilitate astronomers' further observation of propenal in the interstellar medium.

  12. Hydrocarbon Fuel Thermal Performance Modeling based on Systematic Measurement and Comprehensive Chromatographic Analysis

    Science.gov (United States)

    2016-07-31

    distribution unlimited Hydrocarbon Fuel Thermal Performance Modeling based on Systematic Measurement and Comprehensive Chromatographic Analysis Matthew...vital importance for hydrocarbon-fueled propulsion systems: fuel thermal performance as indicated by physical and chemical effects of cooling passage...analysis. The selection and acquisition of a set of chemically diverse fuels is pivotal for a successful outcome since test method validation and

  13. Pixel-based analysis of comprehensive two-dimensional gas chromatograms (color plots) of petroleum

    DEFF Research Database (Denmark)

    Furbo, Søren; Hansen, Asger B.; Skov, Thomas

    2014-01-01

    We demonstrate how to process comprehensive two-dimensional gas chromatograms (GC × GC chromatograms) to remove nonsample information (artifacts), including background and retention time shifts. We also demonstrate how this, combined with further reduction of the influence of irrelevant informati......, allows for data analysis without integration or peak deconvolution (pixel-based analysis)....

  14. Maintenance analysis method and operational feedback: a comprehensive maintenance management

    International Nuclear Information System (INIS)

    Mathieu Riou; Victor Planchon

    2006-01-01

    Full text of publication follows: Current periodic inspections program carried out on the COGEMA LOGISTICS casks is required by regulations and approved by the competent Authority. Thus, Safety and casks conformity to the according certificate of approval are guaranteed. Nonetheless, based on experience it appeared that some maintenance operations did not seem relevant or were redundant. Then, it was decided to rethink completely our maintenance program to reach the following objectives: - Set up the 'a minima' required inspection operations required to guarantee Safety and conformity to the certificate of approval, - Optimize criteria and periodicities of inspections taking into account: operational feedback, routine inspections carried out for each transport, regulations, environmental impact (ALARA, waste reduction,...), cost-effectiveness (reduction of cask's immobilization period,...). - Set up a maintenance program in Safety Analysis Reports that: stands alone (no need to check the specification or the certificate of approval to have the complete list of inspections mandatory to guarantee Safety), gives objectives instead of means of controls. This approach needs then to be re-evaluated by the competent Authority. Study's scope has been limited to the TN TM 12 cask family which is intensely used. COGEMA LOGISTICS has a high operational feedback on these casks. After Authority agreement, and in accordance with its requirements, study will then be extended to the other casks belonging to the COGEMA LOGISTICS cask fleet. Actually, the term 'maintenance' is linked to 'Base maintenance' and 'Main maintenance' and implicitly means that the cask is immobilized for a given period. To emphasize the modifications, the term 'maintenance' is no longer used and is substituted by 'periodic upkeep'. By changing the name, COGEMA LOGISTICS wants to emphasize that: some operations can for instance be realized while the cask is unloaded, periodicities are thought in terms of

  15. ANALYSIS, SELECTION AND RANKING OF FOREIGN MARKETS. A COMPREHENSIVE APPROACH

    Directory of Open Access Journals (Sweden)

    LIVIU NEAMŢU

    2013-12-01

    Full Text Available Choosing the appropriate markets for growth and development is essential for a company that wishes to expand its business through international economic exchanges. In this case, however, foreign market research alone is not sufficient, even though it is an important part of the decision process and an indispensable condition for achieving the firm's objectives. In marketing on the national market, the market is already defined and requires no more than prospecting and segmentation; on the international market, beyond the research process, markets must also be selected and classified. Companies with this intention know little or nothing about the conditions offered by one new market or another. Therefore, they must go, step by step, through a complex, multilevel analysis process composed of the selection and ranking of markets, followed by research proper through exploration and segmentation, which can lead to choosing the most profitable markets. In this regard, within this study, we propose a multi-criteria model for the selection and ranking of international development markets, giving companies access to those markets that comply with the company's development strategy.
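
    A multi-criteria ranking of candidate markets can be illustrated with a weighted-sum score. The abstract does not describe the authors' actual model, so the sketch below shows only one generic technique (a normalized weighted sum); the criteria names, weights and scores are hypothetical placeholders.

```python
def weighted_score(criteria_scores, weights):
    """Weighted average of criterion scores (all scores assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[c] * criteria_scores[c] for c in weights) / total_weight


def rank_markets(market_scores, weights):
    """Return market names ordered from highest to lowest weighted score."""
    return sorted(
        market_scores,
        key=lambda name: weighted_score(market_scores[name], weights),
        reverse=True,
    )


# Hypothetical criteria and weights chosen only for illustration.
weights = {"market_size": 0.5, "accessibility": 0.3, "stability": 0.2}
markets = {
    "A": {"market_size": 0.8, "accessibility": 0.5, "stability": 0.6},
    "B": {"market_size": 0.6, "accessibility": 0.9, "stability": 0.7},
    "C": {"market_size": 0.3, "accessibility": 0.4, "stability": 0.9},
}

print(rank_markets(markets, weights))
```

    A selection step could precede the ranking by dropping any market whose score on a must-have criterion falls below a threshold.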

  16. Analyse Factorielle d'une Batterie de Tests de Comprehension Orale et Ecrite (Factor Analysis of a Battery of Tests of Listening and Reading Comprehension). Melanges Pedagogiques, 1971.

    Science.gov (United States)

    Lonchamp, F.

    This is a presentation of the results of a factor analysis of a battery of tests intended to measure listening and reading comprehension in English as a second language. The analysis sought to answer the following questions: (1) whether the factor analysis method yields results when applied to tests which are not specifically designed for this…

  17. Comprehensive analysis of NuMA variation in breast cancer

    Directory of Open Access Journals (Sweden)

    Aittomäki Kristiina

    2008-03-01

    Full Text Available Abstract Background A recent genome wide case-control association study identified NuMA region on 11q13 as a candidate locus for breast cancer susceptibility. Specifically, the variant Ala794Gly was suggested to be associated with increased risk of breast cancer. Methods In order to evaluate the NuMa gene for breast cancer susceptibility, we have here screened the entire coding region and exon-intron boundaries of NuMa in 92 familial breast cancer patients and constructed haplotypes of the identified variants. Five missense variants were further screened in 341 breast cancer cases with a positive family history and 368 controls. We examined the frequency of Ala794Gly in an extensive series of familial (n = 910) and unselected (n = 884) breast cancer cases and controls (n = 906), with a high power to detect the suggested breast cancer risk. We also tested if the variant is associated with histopathologic features of breast tumors. Results Screening of NuMA resulted in identification of 11 exonic variants and 12 variants in introns or untranslated regions. Five missense variants that were further screened in breast cancer cases with a positive family history and controls, were each carried on a unique haplotype. None of the variants, or the haplotypes represented by them, was associated with breast cancer risk although due to low power in this analysis, very low risk alleles may go unrecognized. The NuMA Ala794Gly showed no difference in frequency in the unselected breast cancer case series or familial case series compared to control cases. Furthermore, Ala794Gly did not show any significant association with histopathologic characteristics of the tumors, though Ala794Gly was slightly more frequent among unselected cases with lymph node involvement. Conclusion Our results do not support the role of NuMA variants as breast cancer susceptibility alleles.

  18. Eye laterality: a comprehensive analysis in refractive surgery candidates.

    Science.gov (United States)

    Linke, Stephan J; Druchkiv, Vasyl; Steinberg, Johannes; Richard, Gisbert; Katz, Toam

    2013-08-01

    To explore eye laterality (higher refractive error in one eye) and its association with refractive state, spherical/astigmatic anisometropia, age and sex in refractive surgery candidates. Medical records of 12 493 consecutive refractive surgery candidates were filtered. Refractive error (subjective and cycloplegic) was measured in each subject and correlated with eye laterality. Only subjects with corrected distance visual acuity (CDVA) of >20/22 in each eye were enrolled to exclude amblyopia. Associations between eye laterality and refractive state were analysed by means of t-test, chi-squared test, Spearman's correlation and multivariate logistic regression analysis, respectively. There was no statistically significant difference in spherical equivalent between right (-3.47 ± 2.76 D) and left eyes (-3.47 ± 2.76 D, p = 0.510; Pearson's r = 0.948, p laterality for anisometropia >2.5 D in myopic (-5.64 ± 2.5 D versus -4.92 ± 2.6 D; p = 0.001) and in hyperopic (4.44 ± 1.69 D versus 3.04 ± 1.79 D; p = 0.025) subjects, (II) a tendency for left eye cylindrical laterality in myopic subjects, and (III) myopic male subjects had a higher prevalence of left eye laterality. (IV) Age did not show any significant impact on laterality. Over the full refractive spectrum, this study confirmed previously described strong interocular refractive correlation but revealed a statistically significant higher rate of right eye laterality for anisometropia >2.5 D. In general, our results support the use of data from one eye only in studies of ocular refraction. © 2013 The Authors. Acta Ophthalmologica © 2013 Acta Ophthalmologica Scandinavica Foundation.

  19. Comprehensive analysis of NuMA variation in breast cancer

    International Nuclear Information System (INIS)

    Kilpivaara, Outi; Rantanen, Matias; Tamminen, Anitta; Aittomäki, Kristiina; Blomqvist, Carl; Nevanlinna, Heli

    2008-01-01

    A recent genome-wide case-control association study identified the NuMA region on 11q13 as a candidate locus for breast cancer susceptibility. Specifically, the variant Ala794Gly was suggested to be associated with increased risk of breast cancer. In order to evaluate the NuMA gene for breast cancer susceptibility, we screened the entire coding region and exon-intron boundaries of NuMA in 92 familial breast cancer patients and constructed haplotypes of the identified variants. Five missense variants were further screened in 341 breast cancer cases with a positive family history and 368 controls. We examined the frequency of Ala794Gly in an extensive series of familial (n = 910) and unselected (n = 884) breast cancer cases and controls (n = 906), with a high power to detect the suggested breast cancer risk. We also tested if the variant is associated with histopathologic features of breast tumors. Screening of NuMA resulted in identification of 11 exonic variants and 12 variants in introns or untranslated regions. Five missense variants that were further screened in breast cancer cases with a positive family history and controls were each carried on a unique haplotype. None of the variants, or the haplotypes represented by them, was associated with breast cancer risk, although due to low power in this analysis, very low risk alleles may go unrecognized. The NuMA Ala794Gly showed no difference in frequency in the unselected breast cancer case series or familial case series compared to control cases. Furthermore, Ala794Gly did not show any significant association with histopathologic characteristics of the tumors, though Ala794Gly was slightly more frequent among unselected cases with lymph node involvement. Our results do not support the role of NuMA variants as breast cancer susceptibility alleles.

  20. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  1. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  2. Diagnostic and therapeutic implications of genetic heterogeneity in myeloid neoplasms uncovered by comprehensive mutational analysis

    Directory of Open Access Journals (Sweden)

    Sarah M. Choi

    2017-01-01

    Full Text Available While growing use of comprehensive mutational analysis has led to the discovery of innumerable genetic alterations associated with various myeloid neoplasms, the under-recognized phenomenon of genetic heterogeneity within such neoplasms creates a potential for diagnostic confusion. Here, we describe two cases where expanded mutational testing led to amendment of an initial diagnosis of chronic myelogenous leukemia with subsequent altered treatment of each patient. We demonstrate the power of comprehensive testing in ensuring appropriate classification of genetically heterogeneous neoplasms, and emphasize thoughtful analysis of molecular and genetic data as an essential component of diagnosis and management.

  3. Genetic Markers Analyses and Bioinformatic Approaches to Distinguish Between Olive Tree (Olea europaea L.) Cultivars.

    Science.gov (United States)

    Ben Ayed, Rayda; Ben Hassen, Hanen; Ennouri, Karim; Rebai, Ahmed

    2016-12-01

    The genetic diversity of 22 olive tree cultivars (Olea europaea L.) sampled from different Mediterranean countries was assessed using 5 SNP markers (FAD2.1; FAD2.3; CALC; SOD and ANTHO3) located in four different genes. The genotyping analysis of the 22 cultivars with 5 SNP loci revealed 11 alleles (average 2.2 per locus). The dendrogram based on cultivar genotypes revealed three clusters consistent with the cultivars' classification. In addition, the results obtained with the five SNPs were compared to those obtained with the SSR markers using bioinformatic analyses and by computing a cophenetic correlation coefficient, indicating the usefulness of the UPGMA method for clustering plant genotypes. Based on principal coordinate analysis using a similarity matrix, the first two coordinates explained 54.94 % of the total variance. This work provides a more comprehensive explanation of the diversity available in Tunisian olive cultivars, and an important contribution for olive breeding and olive oil authenticity.

  4. Green Fluorescent Protein-Focused Bioinformatics Laboratory Experiment Suitable for Undergraduates in Biochemistry Courses

    Science.gov (United States)

    Rowe, Laura

    2017-01-01

    An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…

  5. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into several general categories, including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used at each step of the data analysis workflow.

  6. KNIME4NGS: a comprehensive toolbox for next generation sequencing analysis.

    Science.gov (United States)

    Hastreiter, Maximilian; Jeske, Tim; Hoser, Jonathan; Kluge, Michael; Ahomaa, Kaarin; Friedl, Marie-Sophie; Kopetzky, Sebastian J; Quell, Jan-Dominik; Mewes, H Werner; Küffner, Robert

    2017-05-15

    Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, Linux-based toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As the basis for this actively developed toolbox we use the workflow management software KNIME. See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). Contact: robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.

  7. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  8. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  9. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  10. Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai

    2014-09-03

    MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. A comprehensive

  11. A Comprehensive Analysis of the Quality of Online Health-Related Information regarding Schizophrenia

    Science.gov (United States)

    Guada, Joseph; Venable, Victoria

    2011-01-01

    Social workers are major mental health providers and, thus, can be key players in guiding consumers and their families to accurate information regarding schizophrenia. The present study, using the WebMedQual scale, is a comprehensive analysis across a one-year period at two different time points of the top for-profit and nonprofit sites that…

  12. Impact of comprehensive two-dimensional gas chromatography with mass spectrometry on food analysis.

    Science.gov (United States)

    Tranchida, Peter Q; Purcaro, Giorgia; Maimone, Mariarosa; Mondello, Luigi

    2016-01-01

    Comprehensive two-dimensional gas chromatography with mass spectrometry has been on the separation-science scene for about 15 years. This three-dimensional method has made a great positive impact on various fields of research, and among these that related to food analysis is certainly at the forefront. The present critical review is based on the use of comprehensive two-dimensional gas chromatography with mass spectrometry in the untargeted (general qualitative profiling and fingerprinting) and targeted analysis of food volatiles; attention is focused not only on its potential in such applications, but also on how recent advances in comprehensive two-dimensional gas chromatography with mass spectrometry will potentially be important for food analysis. Additionally, emphasis is devoted to the many instances in which straightforward gas chromatography with mass spectrometry is a sufficiently-powerful analytical tool. Finally, possible future scenarios in the comprehensive two-dimensional gas chromatography with mass spectrometry food analysis field are discussed. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Comprehensive two-dimensional gas chromatography for the analysis of organohalogenated micro-contaminants

    NARCIS (Netherlands)

    Korytar, P.; Haglund, P.; Boer, de J.; Brinkman, U.A.Th.

    2006-01-01

    We explain the principles of comprehensive two-dimensional gas chromatography (GC × GC), and discuss key instrumental aspects - with emphasis on column combinations and mass spectrometric detection. As the main item of interest, we review the potential of GC × GC for the analysis of

  14. Comprehension-Driven Program Analysis (CPA) for Malware Detection in Android Phones

    Science.gov (United States)

    2015-07-01

    Android source. 3.1.2.2 Analyzers: An analyzer conforms to specifications defined by the Security Toolbox. Specifically an analyzer encapsulates a... Comprehension-Driven Program Analysis (CPA) for Malware Detection in Android Phones. Iowa State University, July 2015. Final...

  15. Quantitative analysis of target components by comprehensive two-dimensional gas chromatography

    NARCIS (Netherlands)

    Mispelaar, V.G. van; Tas, A.C.; Smilde, A.K.; Schoenmakers, P.J.; Asten, A.C. van

    2003-01-01

    Quantitative analysis using comprehensive two-dimensional (2D) gas chromatography (GC) is still rarely reported. This is largely due to a lack of suitable software. The objective of the present study is to generate quantitative results from a large GC x GC data set, consisting of 32 chromatograms.

  16. Use of the Comprehensive Inversion method for Swarm satellite data analysis

    DEFF Research Database (Denmark)

    Sabaka, T. J.; Tøffner-Clausen, Lars; Olsen, Nils

    2013-01-01

    An advanced algorithm, known as the “Comprehensive Inversion” (CI), is presented for the analysis of Swarm measurements to generate a consistent set of Level-2 data products to be delivered by the Swarm “Satellite Constellation Application and Research Facility” (SCARF) to the European Space Agency...

  17. L2 Reading Comprehension and Its Correlates: A Meta-Analysis

    Science.gov (United States)

    Jeon, Eun Hee; Yamashita, Junko

    2014-01-01

    The present meta-analysis examined the overall average correlation (weighted for sample size and corrected for measurement error) between passage-level second language (L2) reading comprehension and 10 key reading component variables investigated in the research domain. Four high-evidence correlates (with 18 or more accumulated effect sizes: L2…

  18. Comprehensive Mass Analysis for Chemical Processes, a Case Study on L-Dopa Manufacture

    Science.gov (United States)

    To evaluate the “greenness” of chemical processes in route selection and process development, we propose a comprehensive mass analysis to inform the stakeholders from different fields. This is carried out by characterizing the mass intensity for each contributing chemical or wast...

  19. Comparative Bioinformatics Analysis of Transcription Factor Genes Indicates Conservation of Key Regulatory Domains among Babesia bovis, Babesia microti, and Theileria equi.

    Science.gov (United States)

    Alzan, Heba F; Knowles, Donald P; Suarez, Carlos E

    2016-11-01

    Apicomplexa tick-borne hemoparasites, including Babesia bovis, Babesia microti, and Theileria equi are responsible for bovine and human babesiosis and equine theileriosis, respectively. These parasites of vast medical, epidemiological, and economic impact have complex life cycles in their vertebrate and tick hosts. Large gaps in knowledge concerning the mechanisms used by these parasites for gene regulation remain. Regulatory genes coding for DNA binding proteins such as members of the Api-AP2, HMG, and Myb families are known to play crucial roles as transcription factors. Although the repertoire of Api-AP2 has been defined and a HMG gene was previously identified in the B. bovis genome, these regulatory genes have not been described in detail in B. microti and T. equi. In this study, comparative bioinformatics was used to: (i) identify and map genes encoding for these transcription factors among three parasites' genomes; (ii) identify a previously unreported HMG gene in B. microti; (iii) define a repertoire of eight conserved Myb genes; and (iv) identify AP2 correlates among B. bovis and the better-studied Plasmodium parasites. Searching the available transcriptome of B. bovis defined patterns of transcription of these three gene families in B. bovis erythrocyte stage parasites. Sequence comparisons show conservation of functional domains and general architecture in the AP2, Myb, and HMG proteins, which may be significant for the regulation of common critical parasite life cycle transitions in B. bovis, B. microti, and T. equi. A detailed understanding of the role of gene families encoding DNA binding proteins will provide new tools for unraveling regulatory mechanisms involved in B. bovis, B. microti, and T. equi life cycles and environmental adaptive responses and potentially contributes to the development of novel convergent strategies for improved control of babesiosis and equine piroplasmosis.

  20. Comparative Bioinformatics Analysis of Transcription Factor Genes Indicates Conservation of Key Regulatory Domains among Babesia bovis, Babesia microti, and Theileria equi.

    Directory of Open Access Journals (Sweden)

    Heba F Alzan

    2016-11-01

    Full Text Available Apicomplexa tick-borne hemoparasites, including Babesia bovis, Babesia microti, and Theileria equi are responsible for bovine and human babesiosis and equine theileriosis, respectively. These parasites of vast medical, epidemiological, and economic impact have complex life cycles in their vertebrate and tick hosts. Large gaps in knowledge concerning the mechanisms used by these parasites for gene regulation remain. Regulatory genes coding for DNA binding proteins such as members of the Api-AP2, HMG, and Myb families are known to play crucial roles as transcription factors. Although the repertoire of Api-AP2 has been defined and a HMG gene was previously identified in the B. bovis genome, these regulatory genes have not been described in detail in B. microti and T. equi. In this study, comparative bioinformatics was used to: (i) identify and map genes encoding for these transcription factors among three parasites' genomes; (ii) identify a previously unreported HMG gene in B. microti; (iii) define a repertoire of eight conserved Myb genes; and (iv) identify AP2 correlates among B. bovis and the better-studied Plasmodium parasites. Searching the available transcriptome of B. bovis defined patterns of transcription of these three gene families in B. bovis erythrocyte stage parasites. Sequence comparisons show conservation of functional domains and general architecture in the AP2, Myb, and HMG proteins, which may be significant for the regulation of common critical parasite life cycle transitions in B. bovis, B. microti, and T. equi. A detailed understanding of the role of gene families encoding DNA binding proteins will provide new tools for unraveling regulatory mechanisms involved in B. bovis, B. microti, and T. equi life cycles and environmental adaptive responses and potentially contributes to the development of novel convergent strategies for improved control of babesiosis and equine piroplasmosis.

  1. Association of mRNA expression of TP53 and the TP53 codon 72 Arg/Pro gene polymorphism with colorectal cancer risk in Asian population: a bioinformatics analysis and meta-analysis.

    Science.gov (United States)

    Dong, Zhiyong; Zheng, Longzhi; Liu, Weimin; Wang, Cunchuan

    2018-01-01

    The relationship between TP53 codon 72 Pro/Arg gene polymorphism and colorectal cancer risk in Asians is still controversial, and this bioinformatics analysis and meta-analysis was performed to assess the associations. The association studies were identified from PubMed, and eligible reports were included. RevMan 5.3.1 software, Oncolnc, cBioPortal, and Oncomine online tools were used for statistical analysis. A random/fixed effects model was used in meta-analysis. The data were reported as risk ratios or mean differences with corresponding 95% CI. We confirmed that TP53 was associated with colorectal cancer, the alteration frequency of TP53 was 53% mutation and 7% deep deletion, and TP53 mRNA expression was different in different types of colorectal cancer based on The Cancer Genome Atlas database. Then, 18 studies were included that examine the association of TP53 codon 72 gene polymorphism with colorectal cancer risk in Asians. The meta-analysis indicated that TP53 Pro allele and Pro/Pro genotype were associated with colorectal cancer risk in Asian population, but Arg/Arg genotype was not (Pro allele: odds ratios [OR]=1.20, 95% CI: 1.06 to 1.35, P =0.003; Pro/Pro genotype: OR=1.39, 95% CI: 1.15 to 1.69, P =0.0007; Arg/Arg genotype: OR=0.86, 95% CI: 0.74 to 1.00, P =0.05). Interestingly, in the meta-analysis of the controls from the population-based studies, we found that TP53 codon 72 Pro/Arg gene polymorphism was associated with colorectal cancer risk (Pro allele: OR=1.33, 95% CI: 1.15 to 1.55, P =0.0002; Pro/Pro genotype: OR=1.61, 95% CI: 1.28 to 2.02, P colorectal cancer, but the different value levels of mRNA expression were not associated with survival rate of colon and rectal cancer. TP53 Pro allele and Pro/Pro genotype were associated with colorectal cancer risk in Asians.
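
    The odds ratios and 95% confidence intervals quoted in the record above are the standard summary for case-control genotype data. As a minimal sketch of how a single-study odds ratio and its Wald interval are obtained from a 2x2 table (this is not the pooled random/fixed-effects computation performed in RevMan, and the function name and any counts passed to it are hypothetical, not taken from the study):

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Wald interval on the log scale: log(OR) +/- z * SE, where
    SE = sqrt(1/a + 1/b + 1/c + 1/d). All four counts must be positive.
    z = 1.96 corresponds to a 95% interval.
    """
    a, b, c, d = (exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(odds_ratio_ci(20, 10, 10, 20))
```

    An interval that excludes 1 corresponds to an association at the chosen significance level; an upper bound of exactly 1.00, as reported for the Arg/Arg genotype, is the borderline case.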

  2. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...

  3. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  4. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  5. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
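
    The dynamic-programming formulation mentioned in the record above can be sketched as follows. This is a minimal Needleman-Wunsch global alignment with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -1) and the function name are illustrative assumptions, not the article's own examples.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_alignment("AC", "AGC"))
```

    The table has (n + 1) x (m + 1) cells, so time and memory both grow with the product of the sequence lengths. When several alignments tie for the best score, the traceback returns only one of them.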

  6. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  7. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    Full Text Available The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes).
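
    In the hypercube interpretation, a fuzzy set over n elements is the vector of its membership degrees, a point in [0, 1]^n whose vertices are the ordinary (crisp) sets. One measure commonly defined on this geometry is Kosko's fuzzy entropy, the ratio of the distance to the nearest vertex to the distance to the farthest vertex. The sketch below assumes that measure with city-block distances; whether the paper uses this exact quantity is not stated in the record, and the function name is hypothetical.

```python
def fuzzy_entropy(memberships):
    """Kosko-style fuzzy entropy of a point in the unit hypercube.

    memberships: non-empty sequence of membership degrees, each in [0, 1].
    Returns 0.0 for a crisp set (a vertex) and 1.0 for the cube midpoint.
    """
    if not memberships:
        raise ValueError("need at least one membership degree")
    if any(x < 0.0 or x > 1.0 for x in memberships):
        raise ValueError("membership degrees must lie in [0, 1]")
    # City-block distance to the nearest and to the farthest cube vertex.
    nearest = sum(min(x, 1.0 - x) for x in memberships)
    farthest = sum(max(x, 1.0 - x) for x in memberships)
    return nearest / farthest  # farthest >= n / 2 > 0, so this is safe


if __name__ == "__main__":
    print(fuzzy_entropy([0.0, 1.0, 1.0]))  # a crisp set
    print(fuzzy_entropy([0.5, 0.5, 0.5]))  # the maximally fuzzy point
```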

  8. [Semantic Network Analysis of Online News and Social Media Text Related to Comprehensive Nursing Care Service].

    Science.gov (United States)

    Kim, Minji; Choi, Mona; Youm, Yoosik

    2017-12-01

    As comprehensive nursing care service has gradually expanded, it has become necessary to explore the various opinions about it. The purpose of this study is to explore the large amount of text data regarding comprehensive nursing care service extracted from online news and social media by applying a semantic network analysis. The web pages of the Korean Nurses Association (KNA) News, major daily newspapers, and Twitter were crawled by searching the keyword 'comprehensive nursing care service' using Python. A morphological analysis was performed using KoNLPy. Nodes on a 'comprehensive nursing care service' cluster were selected, and frequency, edge weight, and degree centrality were calculated and visualized with Gephi for the semantic network. A total of 536 news pages and 464 tweets were analyzed. In the KNA News and major daily newspapers, 'nursing workforce' and 'nursing service' were highly rated in frequency, edge weight, and degree centrality. On Twitter, the most frequent nodes were 'National Health Insurance Service' and 'comprehensive nursing care service hospital.' The nodes with the highest edge weight were 'national health insurance,' 'wards without caregiver presence,' and 'caregiving costs.' 'National Health Insurance Service' was highest in degree centrality. This study provides an example of how to use atypical big data for a nursing issue through semantic network analysis to explore diverse perspectives surrounding the nursing community through various media sources. Applying semantic network analysis to online big data to gather information regarding various nursing issues would help to explore opinions for formulating and implementing nursing policies. © 2017 Korean Society of Nursing Science
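
    The three network measures named above (frequency, edge weight, degree centrality) can be illustrated with a short sketch. It assumes the texts have already been tokenized, and it counts an edge whenever two terms co-occur in the same document, which is one common convention and not necessarily the study's. The study itself used KoNLPy and Gephi; the function names and the sample tokens below are illustrative.

```python
from collections import Counter
from itertools import combinations


def build_network(documents):
    """Return (term frequency, edge weight) from a list of token lists.

    An edge is counted once per document in which both terms appear.
    """
    frequency = Counter()
    edge_weight = Counter()
    for tokens in documents:
        frequency.update(tokens)
        unique = sorted(set(tokens))  # sorting makes (a, b) a canonical key
        edge_weight.update(combinations(unique, 2))
    return frequency, edge_weight


def degree_centrality(edge_weight):
    """Number of distinct neighbours of each node, divided by (node count - 1)."""
    neighbours = {}
    for a, b in edge_weight:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical, already-tokenized documents.
docs = [
    ["nursing", "workforce", "service"],
    ["nursing", "service"],
    ["insurance", "service"],
]
frequency, edge_weight = build_network(docs)
centrality = degree_centrality(edge_weight)
```

    In this toy input, "service" co-occurs with every other term and therefore has the highest degree centrality.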

  9. Bioinformatic Analysis of Chlamydia trachomatis Polymorphic Membrane Proteins PmpE, PmpF, PmpG and PmpH as Potential Vaccine Antigens.

    Directory of Open Access Journals (Sweden)

    Alexandra Nunes

    Full Text Available Chlamydia trachomatis is the most important infectious cause of infertility in women with important implications in public health and for which a vaccine is urgently needed. Recent immunoproteomic vaccine studies found that four polymorphic membrane proteins (PmpE, PmpF, PmpG and PmpH) are immunodominant, recognized by various MHC class II haplotypes and protective in mouse models. In the present study, we aimed to evaluate genetic and protein features of Pmps (focusing on the N-terminal 600 amino acids where MHC class II epitopes were mapped) in order to understand antigen variation that may emerge following vaccine induced immune selection. We used several bioinformatics platforms to study: i) Pmps' phylogeny and genetic polymorphism; ii) the location and distribution of protein features (GGA(I, L)/FxxN motifs and cysteine residues) that may impact pathogen-host interactions and protein conformation; and iii) the existence of phase variation mechanisms that may impact Pmps' expression. We used a well-characterized collection of 53 fully-sequenced strains that represent the C. trachomatis serovars associated with the three disease groups: ocular (N=8), epithelial-genital (N=25) and lymphogranuloma venereum (LGV) (N=20). We observed that PmpF and PmpE are highly polymorphic between LGV and epithelial-genital strains, and also within populations of the latter. We also found heterogeneous representation among strains for GGA(I, L)/FxxN motifs and cysteine residues, suggesting possible alterations in adhesion properties, tissue specificity and immunogenicity. PmpG and, to a lesser extent, PmpH revealed low polymorphism and high conservation of protein features among the genital strains (including the LGV group). Uniquely among the four Pmps, pmpG has regulatory sequences suggestive of phase variation. In aggregate, the results suggest that PmpG may be the lead vaccine candidate because of sequence conservation but may need to be paired with another protective…

  10. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  11. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  12. Comprehensive identification and expression analysis of Hsp90s gene family in Solanum lycopersicum.

    Science.gov (United States)

    Zai, W S; Miao, L X; Xiong, Z L; Zhang, H L; Ma, Y R; Li, Y L; Chen, Y B; Ye, S G

    2015-07-14

    Heat shock protein 90 (Hsp90) is a protein produced by plants in response to adverse environmental stresses. In this study, we identified and analyzed Hsp90 gene family members using a bioinformatic method based on genomic data from tomato (Solanum lycopersicum L.). The results illustrated that tomato contains at least 7 Hsp90 genes distributed on 6 chromosomes; protein lengths ranged from 267 to 794 amino acids. Intron numbers ranged from 2 to 19 in the genes. The phylogenetic tree revealed that Hsp90 genes in tomato (Solanum lycopersicum L.), rice (Oryza sativa L.), and Arabidopsis (Arabidopsis thaliana L.) could be divided into 5 groups, which included 3 pairs of orthologous genes and 4 pairs of paralogous genes. Expression analysis of RNA-sequence data showed that the Hsp90-1 gene was specifically expressed in mature fruits, while Hsp90-5 and Hsp90-6 showed opposite expression patterns in various tissues of cultivated and wild tomatoes. The expression levels of the Hsp90-1, Hsp90-2, and Hsp90-3 genes in various tissues of cultivated tomatoes were high, while both the expression levels of genes Hsp90-3 and Hsp90-4 were low. Additionally, quantitative real-time polymerase chain reaction showed that these genes were involved in the responses to yellow leaf curl virus in tomato plant leaves. Our results provide a foundation for identifying the function of the Hsp90 gene in tomato.

  13. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, on June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  14. Analysis and Comprehensive Analytical Modeling of Statistical Variations in Subthreshold MOSFET's High Frequency Characteristics

    Directory of Open Access Journals (Sweden)

    Rawid Banchuin

    2014-01-01

    Full Text Available In this research, statistical variations in the high-frequency characteristics of subthreshold MOSFETs, defined in terms of gate capacitance and transition frequency, are analyzed, and comprehensive analytical models of these variations in terms of their variances are proposed. Major imperfections in physical-level properties, including random dopant fluctuation and the effects of variations in the MOSFET manufacturing process, are taken into account in the analysis and modeling. An up-to-date comprehensive analytical model of statistical variation in MOSFET parameters is used as the basis of the analysis and modeling. The resulting models are both analytic and comprehensive, as they are precise mathematical expressions in terms of the physical-level variables of the MOSFET. Furthermore, they have been verified at the nanometer level using 65 nm BSIM4-based benchmarks and found to be very accurate, with average percentage errors below 5%. Hence, the resulting models are a potential mathematical tool for statistical and variability-aware analysis and design of subthreshold MOSFET-based VHF circuits, systems and applications.

  15. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.

  16. The structural bioinformatics library: modeling in biomolecular science and beyond.

    Science.gov (United States)

    Cazals, Frédéric; Dreyfus, Tom

    2017-04-01

    Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performances. The SBL involves four software components (1-4 thereafter). For end-users, the SBL provides ready to use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with modular design, involving core (2) algorithms, (3) biophysical models and (4) modules, the latter being especially suited to develop novel applications. The SBL comes with a thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Contact: Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  17. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  18. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  19. Comprehensive NMR analysis of compositional changes of black garlic during thermal processing.

    Science.gov (United States)

    Liang, Tingfu; Wei, Feifei; Lu, Yi; Kodani, Yoshinori; Nakada, Mitsuhiko; Miyakawa, Takuya; Tanokura, Masaru

    2015-01-21

    Black garlic is a processed food product obtained by subjecting whole raw garlic to thermal processing that causes chemical reactions, such as the Maillard reaction, which change the composition of the garlic. In this paper, we report a nuclear magnetic resonance (NMR)-based comprehensive analysis of raw garlic and black garlic extracts to determine the compositional changes resulting from thermal processing. ¹H NMR spectra with a detailed signal assignment showed that 38 components were altered by thermal processing of raw garlic. For example, the contents of 11 L-amino acids increased during the first step of thermal processing over 5 days and then decreased. Multivariate data analysis revealed changes in the contents of fructose, glucose, acetic acid, formic acid, pyroglutamic acid, cycloalliin, and 5-(hydroxymethyl)furfural (5-HMF). Our results provide comprehensive information on changes in NMR-detectable components during thermal processing of whole garlic.

  20. Comprehensive sequence analysis of nine Usher syndrome genes in the UK National Collaborative Usher Study

    OpenAIRE

    Le Quesne Stabej, Polona; Saihan, Zubin; Rangesh, Nell; Steele-Stallard, Heather B; Ambrose, John; Coffey, Alison; Emmerson, Jenny; Haralambous, Elene; Hughes, Yasmin; Steel, Karen P; Luxon, Linda M; Webster, Andrew R; Bitner-Glindzicz, Maria

    2011-01-01

    Background: Usher syndrome (USH) is an autosomal recessive disorder comprising retinitis pigmentosa, hearing loss and, in some cases, vestibular dysfunction. It is clinically and genetically heterogeneous with three distinctive clinical types (I–III) and nine Usher genes identified. This study is a comprehensive clinical and genetic analysis of 172 Usher patients and evaluates the contribution of digenic inheritance. Methods: The genes MYO7A, USH1C, CDH23, PCDH15, USH1G, USH2A, GPR98, WHRN, CLR...

  1. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through a bioinformatics approach. Among 107 Shigella strains, 434 CRISPR structure loci were identified, with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. Deletions of cas genes were discovered in 88 strains. However, there is one strain that does not contain a cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Type I spacers are those linked with one gene segment, Type II spacers are linked with two or more different gene segments, and Type III spacers are undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.
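
    The three-way spacer classification described above can be expressed as a simple rule. The sketch below assumes that each spacer has already been reduced to the collection of gene segments it was linked with; how those links are derived from sequence signature, BLAST and CRISPRTarget scores is not specified in the abstract, and the function name and sample labels are hypothetical.

```python
def classify_spacer(matched_gene_segments):
    """Classify a spacer by how many distinct gene segments it is linked with.

    Type I: exactly one; Type II: two or more different ones; Type III: none (undefined).
    """
    distinct = set(matched_gene_segments)
    if len(distinct) == 1:
        return "Type I"
    if len(distinct) >= 2:
        return "Type II"
    return "Type III"


# Hypothetical segment labels, for illustration only.
examples = {
    "spacer_a": ["segment_1"],
    "spacer_b": ["segment_1", "segment_2"],
    "spacer_c": [],
}
labels = {name: classify_spacer(hits) for name, hits in examples.items()}
```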

  2. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  3. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
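
    The MapReduce model mentioned above can be illustrated in a single process. The sketch below counts k-mers in sequencing reads using explicit map, shuffle and reduce phases. It is a toy illustration of the programming model, not code for any particular cloud framework, and all function names are illustrative.

```python
from collections import defaultdict


def map_read(read, k):
    """Map phase: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: sum the values emitted for one key."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run the three phases in sequence over a list of reads."""
    mapped = [pair for read in reads for pair in map_read(read, k)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


kmer_counts = count_kmers(["ACGT", "CGTA"], 3)
```

    In a real deployment the map and reduce calls would run on different machines and the shuffle would move data between them; here everything runs in one process so the data flow is easy to follow.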

  4. Toward Personalized Pressure Ulcer Care Planning: Development of a Bioinformatics System for Individualized Prioritization of Clinical Practice Guideline

    Science.gov (United States)

    2016-10-01

    Award number: W81XWH-15-1-0342. ...recommendations of CPG has been identified by experts in the field. We will use bioinformatics to enable data extraction, storage, and analysis to support

  5. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the

  6. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
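
    The kind of cross-database query described above can be sketched with a toy example. The sketch uses an in-memory SQLite database rather than MySQL or Oracle, and the two tables (enzyme_activity, protein_sequence), their columns and the sample rows are invented for illustration; they are not the BioWarehouse schema or its data.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (accession TEXT PRIMARY KEY, ec_number TEXT);
        """
    )
    # Toy rows: four EC numbers, three of which have at least one sequence.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("1.1.1.1",), ("1.1.1.2",), ("2.7.1.1",), ("3.1.1.1",)],
    )
    conn.executemany(
        "INSERT INTO protein_sequence VALUES (?, ?)",
        [("P1", "1.1.1.1"), ("P2", "1.1.1.1"), ("P3", "1.1.1.2"), ("P4", "2.7.1.1")],
    )
    return conn


def fraction_without_sequence(conn):
    """Fraction of EC numbers that have no row in protein_sequence."""
    total, missing = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return missing / total if total else 0.0


toy_fraction = fraction_without_sequence(build_toy_warehouse())
```

    The LEFT JOIN keeps every enzyme activity and leaves NULL where no sequence matches, so a single statement spans what would otherwise be two separate source databases. The value computed here reflects only the toy rows.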

  7. Systems Bioinformatics: increasing precision of computational diagnostics and therapeutics through network-based approaches.

    Science.gov (United States)

    Oulas, Anastasis; Minadakis, George; Zachariou, Margarita; Sokratous, Kleitos; Bourdakou, Marilena M; Spyrou, George M

    2017-11-27

    Systems Bioinformatics is a relatively new approach, which lies in the intersection of systems biology and classical bioinformatics. It focuses on integrating information across different levels using a bottom-up approach as in systems biology with a data-driven top-down approach as in bioinformatics. The advent of omics technologies has provided the stepping-stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics and proteomics to epigenomics, pharmacogenomics, metagenomics and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest and studying the emerging properties of the system as a whole rather than the sum of the properties derived from the system's individual components. A key approach in Systems Bioinformatics is the construction of multiple networks representing each level of the omics spectrum and their integration in a layered network that exchanges information within and between layers. Here, we provide evidence on how Systems Bioinformatics enhances computational therapeutics and diagnostics, hence paving the way to precision medicine. The aim of this review is to familiarize the reader with the emerging field of Systems Bioinformatics and to provide a comprehensive overview of its current state-of-the-art methods and technologies. Moreover, we provide examples of success stories and case studies that utilize such methods and tools to significantly advance research in the fields of systems biology and systems medicine. © The Author 2017. Published by Oxford University Press.

  8. The Effects of Visual Attention Span and Phonological Decoding in Reading Comprehension in Dyslexia: A Path Analysis

    OpenAIRE

    Chen, C.; Schneps, M.; Masyn, K.; Thomson, J.

    2016-01-01

    Increasing evidence has shown visual attention span to be a factor, distinct from phonological skills, that explains single-word identification (pseudo-word/word reading) performance in dyslexia. Yet, little is known about how well visual attention span explains text comprehension. Observing reading comprehension in a sample of 105 high school students with dyslexia, we used a pathway analysis to examine the direct and indirect path between visual attention span and reading comprehension whil...

  9. Comprehensive experimental analysis of nonlinear dynamics in an optically-injected semiconductor laser

    Directory of Open Access Journals (Sweden)

    Kevin Schires

    2011-09-01

    Full Text Available We present the first comprehensive experimental study, to our knowledge, of the routes between nonlinear dynamics induced in a semiconductor laser under external optical injection based on an analysis of time-averaged measurements of the optical and RF spectra and phasors of real-time series of the laser output. The different means of analysis are compared for several types of routes and the benefits of each are discussed in terms of the identification and mapping of the nonlinear dynamics. Finally, the results are presented in a novel audio/video format that describes the evolution of the dynamics with the injection parameters.

  10. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    Bioinformatics is an emerging scientific discipline that uses information ... complex biological questions. ... and computer programs for various purposes of primer ..... polymerase chain reaction: Human Immunodeficiency Virus 1 model studies.

  11. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  12. Deciphering psoriasis. A bioinformatic approach.

    Science.gov (United States)

    Melero, Juan L; Andrades, Sergi; Arola, Lluís; Romeu, Antoni

    2018-02-01

    Psoriasis is an immune-mediated, inflammatory and hyperproliferative disease of the skin and joints. The cause of psoriasis is still unknown. The fundamental feature of the disease is the hyperproliferation of keratinocytes and the recruitment of cells from the immune system in the region of the affected skin, which leads to deregulation of the expression of many well-known genes. Based on data mining and bioinformatic scripting, here we show a new dimension of the effect of psoriasis at the genomic level. Using our own pipeline of scripts in Perl and MySQL and based on the freely available NCBI Gene Expression Omnibus (GEO) database: DataSet Record GDS4602 (Series GSE13355), we explore the extent of the effect of psoriasis on gene expression in the affected tissue. We give greater insight into the effects of psoriasis on the up-regulation of some genes in the cell cycle (CCNB1, CCNA2, CCNE2, CDK1) or the dynamin system (GBPs, MXs, MFN1), as well as the down-regulation of typical antioxidant genes (catalase, CAT; superoxide dismutases, SOD1-3; and glutathione reductase, GSR). We also provide a complete list of the human genes and how they respond in a state of psoriasis. Our results show that psoriasis affects all chromosomes and many biological functions. If we further consider the stable and mitotically inheritable character of the psoriasis phenotype, and the influence of environmental factors, then it seems that psoriasis has an epigenetic origin. This fits well with the strong hereditary character of the disease as well as its complex genetic background. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.

  13. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    Science.gov (United States)

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live Linux distributions for Bioinformatics have paved the way for portability of the Bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery portrays the advanced customizable configuration of Fedora, with data persistency accessible via USB drive or DVD. Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  14. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins. Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  15. Key Concept Identification: A Comprehensive Analysis of Frequency and Topical Graph-Based Approaches

    Directory of Open Access Journals (Sweden)

    Muhammad Aman

    2018-05-01

    Full Text Available Automatic key concept extraction from text is the main challenging task in information extraction, information retrieval and digital libraries, ontology learning, and text analysis. The statistical frequency and topical graph-based ranking are the two kinds of potentially powerful and leading unsupervised approaches in this area, devised to address the problem. To utilize the potential of these approaches and improve key concept identification, a comprehensive performance analysis of these approaches on datasets from different domains is needed. The objective of the study presented in this paper is to perform a comprehensive empirical analysis of selected frequency and topical graph-based algorithms for key concept extraction on three different datasets, to identify the major sources of error in these approaches. For experimental analysis, we have selected TF-IDF, KP-Miner and TopicRank. Three major sources of error, i.e., frequency errors, syntactical errors and semantical errors, and the factors that contribute to these errors are identified. Analysis of the results reveals that performance of the selected approaches is significantly degraded by these errors. These findings can help us develop an intelligent solution for key concept extraction in the future.
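
    The frequency-based family evaluated in the record above can be illustrated with a minimal TF-IDF sketch. This is not the implementation used in the study; the tokenization (whitespace splitting), the natural-log IDF without smoothing, and the three toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document; docs are lists of tokens."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "gene expression analysis of gene networks".split(),
    "analysis of protein structure".split(),
    "analysis of text corpora".split(),
]
ranked = sorted(tf_idf(docs)[0].items(), key=lambda kv: kv[1], reverse=True)
top_term = ranked[0][0]
```

    Terms that occur in every document (here "analysis" and "of") receive a score of zero, which is one way purely frequency-based ranking can miss a genuinely central concept.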

  16. New Link in Bioinformatics Services Value Chain: Position, Organization and Business Model

    Directory of Open Access Journals (Sweden)

    Mladen Čudanov

    2012-11-01

    Full Text Available This paper presents developments in the bioinformatics services industry value chain, based on the cloud computing paradigm. As genome sequencing costs per Megabase drop exponentially, the industry needs to adapt. The paper has two parts: a theoretical analysis and a practical example of the Seven Bridges Genomics company. We focus on explaining the organizational, business and financial aspects of a new business model in bioinformatics services, rather than the technical side of the problem. In light of that, we present a twofold business model fit for core bioinformatics research and Information and Communication Technology (ICT) support in the new environment, with a higher level of capital utilization and better resistance to business risks.

  17. Comprehensive analysis of a straw-fired power plant in Vojvodina

    Directory of Open Access Journals (Sweden)

    Urošević Dragan M.

    2012-01-01

    Full Text Available In recent years, renewable energy sources have played an increasingly important role in potential energy production. The integration of renewable energy technologies into the existing national energy system has therefore become a major challenge for many countries. Due to the importance of this matter, this paper deals with a comprehensive analysis for the implementation of a power plant fired by biomass (straw). The analysis is conducted with regard to several key indicators: availability of biomass, regulation, reduction of greenhouse gas emissions, location, land use, electricity price and social impacts. The analysis also includes the favorable price for electricity produced from biomass under national feed-in tariffs. In order to demonstrate all of the above-mentioned indicators, a region in Serbia (the Province of Vojvodina) with significant potential in biomass, especially in straw, is selected. The results of the analysis are validated through environmental and social aspects. Special attention is given to identifying risks for this application.

  18. A Comprehensive Sensitivity Analysis of a Data Center Network with Server Virtualization for Business Continuity

    Directory of Open Access Journals (Sweden)

    Tuan Anh Nguyen

    2015-01-01

    Full Text Available Sensitivity assessment of availability for data center networks (DCNs) is of paramount importance in the design and management of cloud computing based businesses. Previous work has presented a performance modeling and analysis of a fat-tree based DCN using queuing theory. In this paper, we present a comprehensive availability modeling and sensitivity analysis of a DCell-based DCN with server virtualization for business continuity using stochastic reward nets (SRN). We use SRN in modeling to capture complex behaviors and dependencies of the system in detail. The models take into account (i) two DCell configurations, respectively, composed of two and three physical hosts in a DCell0 unit, (ii) failure modes and corresponding recovery behaviors of hosts, switches, and VMs, and VM live migration mechanism within and between DCell0s, and (iii) dependencies between subsystems (e.g., between a host and VMs and between switches and VMs in the same DCell0). The constructed SRN models are analyzed in detail with regard to various metrics of interest to investigate the system’s characteristics. A comprehensive sensitivity analysis of system availability is carried out in consideration of the major impacting parameters in order to observe the system’s complicated behaviors and find the bottlenecks of system availability. The analysis results show the availability improvement, capability of fault tolerance, and business continuity of the DCNs complying with DCell network topology. This study provides a basis for the design and management of DCNs for business continuity.
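
    As a rough illustration of why adding a third host can raise availability, the sketch below uses the textbook steady-state formula A = MTTF / (MTTF + MTTR) with series and parallel composition. It is a deliberate simplification and not the stochastic reward net model of the record above: it assumes independent failures, whereas the SRN models are built precisely to capture dependencies. All MTTF and MTTR figures are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability of a single repairable component."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series(*parts):
    """All parts must be up (assumes independent failures)."""
    result = 1.0
    for a in parts:
        result *= a
    return result


def parallel(*parts):
    """At least one part must be up (assumes independent failures)."""
    unavailable = 1.0
    for a in parts:
        unavailable *= 1.0 - a
    return 1.0 - unavailable


# Hypothetical figures, chosen only for illustration.
host = availability(mttf_hours=1000.0, mttr_hours=10.0)
switch = availability(mttf_hours=5000.0, mttr_hours=5.0)

# One switch in series with two or three redundant hosts.
two_hosts = series(switch, parallel(host, host))
three_hosts = series(switch, parallel(host, host, host))
```

    Under these assumptions the single switch quickly becomes the limiting factor, which is the kind of bottleneck a sensitivity analysis is meant to expose.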

  19. Bioinformatics functional analysis of let-7a, miR-34a, and miR-199a/b reveals novel insights into immune system pathways and cancer hallmarks for hepatocellular carcinoma.

    Science.gov (United States)

    Soliman, Bangly; Salem, Ahmed; Ghazy, Mohamed; Abu-Shahba, Nourhan; El Hefnawi, Mahmoud

    2018-05-01

    Let-7a, miR-34a, and miR-199 a/b have gained great attention as master regulators for cellular processes. In particular, these three micro-RNAs act as potential onco-suppressors for hepatocellular carcinoma. Bioinformatics can reveal the functionality of these micro-RNAs through target prediction and functional annotation analysis. In the current study, in silico analysis using innovative servers (miRror Suite, DAVID, miRGator V3.0, GeneTrail) has demonstrated the combinatorial and the individual target genes of these micro-RNAs and further explored their roles in hepatocellular carcinoma progression. There were 87 common target messenger RNAs (p ≤ 0.05) that were predicted to be regulated by the three micro-RNAs using miRror 2.0 target prediction tool. In addition, the functional enrichment analysis of these targets that was performed by DAVID functional annotation and REACTOME tools revealed two major immune-related pathways, eight hepatocellular carcinoma hallmarks-linked pathways, and two pathways that mediate interconnected processes between immune system and hepatocellular carcinoma hallmarks. Moreover, protein-protein interaction network for the predicted common targets was obtained by using STRING database. The individual analysis of target genes and pathways for the three micro-RNAs of interest using miRGator V3.0 and GeneTrail servers revealed some novel predicted target oncogenes such as SOX4, which we validated experimentally, in addition to some regulated pathways of immune system and hepatocarcinogenesis such as insulin signaling pathway and adipocytokine signaling pathway. In general, our results demonstrate that let-7a, miR-34a, and miR-199 a/b have novel interactions in different immune system pathways and major hepatocellular carcinoma hallmarks. Thus, our findings shed more light on the roles of these miRNAs as cancer silencers.

  20. Comprehensive Evaluation and Analysis of China's Mainstream Online Map Service Websites

    Science.gov (United States)

    Zhang, H.; Jiang, J.; Huang, W.; Wang, Q.; Gu, X.

    2012-08-01

    With the flourishing development of China's Internet market, demand for map services from all kinds of users is rising continually, and this demand carries tremendous commercial interest. Many internet giants have become involved in the field of online map services and defined it as an important strategic product of the company. The main purpose of this research is to evaluate these online map service websites comprehensively with a model, and to analyse the problems according to the evaluation results. Corresponding measures are then proposed, which provide theoretical and practical guidance for the future development of fiercely competitive online map websites. The research consists of three stages: (a) the mainstream online map service websites in China are introduced and their present situation is analysed through visits, investigation, consultation, analysis and research; (b) a comprehensive evaluation quota system for online map service websites is established from the point of view of functions, layout, interaction design, color, position and so on, combined with data indexes such as time efficiency, accuracy, objectivity and authority; (c) a comprehensive evaluation of these online map service websites is carried out based on the fuzzy evaluation mathematical model, which resolves the difficulty of measuring the map websites quantitatively.
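
    Stage (c) above relies on a fuzzy evaluation model. A minimal sketch of the common weighted-average form of fuzzy comprehensive evaluation follows: a weight vector over criteria is combined with a membership matrix (criteria by grades), and the grade with the largest resulting membership is selected. The abstract does not state which composition operator, criteria weights or grades the authors used, so every name and number here is hypothetical.

```python
def fuzzy_evaluate(weights, membership):
    """Combine criterion weights with a criteria-by-grades membership matrix."""
    n_grades = len(membership[0])
    return [
        # Weighted-average operator: sum over criteria of weight * membership.
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_grades)
    ]


# Hypothetical criteria, weights and grades.
criteria = ["functions", "layout", "accuracy"]
weights = [0.5, 0.3, 0.2]          # assumed to sum to 1
grades = ["good", "fair", "poor"]
membership = [
    [0.7, 0.2, 0.1],  # functions: share of raters choosing each grade
    [0.2, 0.5, 0.3],  # layout
    [0.1, 0.4, 0.5],  # accuracy
]

scores = fuzzy_evaluate(weights, membership)
best_grade = grades[scores.index(max(scores))]
```

    Choosing the grade by maximum membership is only one convention; a weighted score across grades is an equally common alternative.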

  1. COMPREHENSIVE EVALUATION AND ANALYSIS OF CHINA’S MAINSTREAM ONLINE MAP SERVICE WEBSITES

    Directory of Open Access Journals (Sweden)

    H. Zhang

    2012-08-01

    Full Text Available With the flourishing development of China's Internet market, demand for map services from all kinds of users is rising continually, and this demand carries tremendous commercial interest. Many internet giants have become involved in the field of online map services and defined it as an important strategic product of the company. The main purpose of this research is to evaluate these online map service websites comprehensively with a model, and to analyse the problems according to the evaluation results. Corresponding measures are then proposed, which provide theoretical and practical guidance for the future development of fiercely competitive online map websites. The research consists of three stages: (a) the mainstream online map service websites in China are introduced and their present situation is analysed through visits, investigation, consultation, analysis and research; (b) a comprehensive evaluation quota system for online map service websites is established from the point of view of functions, layout, interaction design, color, position and so on, combined with data indexes such as time efficiency, accuracy, objectivity and authority; (c) a comprehensive evaluation of these online map service websites is carried out based on the fuzzy evaluation mathematical model, which resolves the difficulty of measuring the map websites quantitatively.

  2. Transcriptomic and bioinformatics analysis of the early time-course of the response to prostaglandin F2 alpha in the bovine corpus luteum

    Science.gov (United States)

    RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069). Subsequent statistical ana...

  3. THE ROLE OF GENDER IN READING COMPREHENSION: AN ANALYSIS OF COLLEGE-LEVEL EFL STUDENTS’ COMPREHENSION OF DIFFERENT GENRES

    Directory of Open Access Journals (Sweden)

    Didem Koban Koç

    2016-07-01

    Full Text Available The purpose of the present study is to examine the effects of gender on comprehending different types of genre. The study involved 60 first year college students (30 males and 30 females) who were taking an advanced reading course at a government university in Turkey. The students were given three reading passages of different genres (historical fiction, essay and fantasy) and were asked to answer comprehension questions related to the passages. Descriptive statistics, one-way ANOVA and repeated measures ANOVA were employed to analyse the relationship between gender and the test scores for each text type. The results showed that (1) the participants, in general, were significantly better at understanding the essay than historical fiction and fantasy, (2) there was not a statistically significant difference between males and females regarding comprehending the different types of genres, and (3) both the male and female participants were significantly better at understanding the essay than historical fiction and fantasy. The study offers suggestions regarding incorporating different types of genre in the classroom.
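
    For readers unfamiliar with the one-way ANOVA mentioned above, the sketch below computes its F statistic as the between-group mean square divided by the within-group mean square. It covers only the one-way between-group case, not the repeated measures design also used in the study, and the scores are hypothetical rather than taken from the paper.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over lists of numeric observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical comprehension scores for three text types.
scores_by_genre = [
    [1.0, 2.0, 3.0],  # historical fiction
    [2.0, 3.0, 4.0],  # fantasy
    [3.0, 4.0, 5.0],  # essay
]
f_stat = one_way_anova_f(scores_by_genre)
```

    Turning the F statistic into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which this sketch leaves out.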

  4. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.

  5. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  6. Multichannel microscale system for high throughput preparative separation with comprehensive collection and analysis

    Energy Technology Data Exchange (ETDEWEB)

    Karger, Barry L.; Kotler, Lev; Foret, Frantisek; Minarik, Marek; Kleparnik, Karel

    2003-12-09

    A modular multiple lane or capillary electrophoresis (chromatography) system that permits automated parallel separation and comprehensive collection of all fractions from samples in all lanes or columns, with the option of further on-line automated sample fraction analysis, is disclosed. Preferably, fractions are collected in a multi-well fraction collection unit, or plate (40). The multi-well collection plate (40) is preferably made of a solvent permeable gel, most preferably a hydrophilic, polymeric gel such as agarose or cross-linked polyacrylamide.

  7. TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data.

    Science.gov (United States)

    Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong

    2018-01-04

    Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, both on transcriptome and translatome levels. The translation indices (translation ratios, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity, respectively. All datasets were analyzed using a unified, robust, accurate and experimentally-verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that our TranslatomeDB is a comprehensive platform and knowledgebase on translatome and proteome research, freeing biologists from complex searching, analyzing and comparing of huge sequencing data without needing local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
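
    The abstract above does not define its translation indices. As an assumption for illustration only, the sketch below treats a per-gene translation ratio as ribosome-associated mRNA abundance divided by total mRNA abundance, and skips genes whose mRNA abundance is too low for a stable ratio. The function name, the threshold and the abundance values are hypothetical and do not describe the TranslatomeDB pipeline.

```python
def translation_ratio(translatome, transcriptome, min_mrna=1.0):
    """Per-gene ratio of ribosome-associated to total mRNA abundance.

    Both inputs map gene names to normalized abundances. Genes whose mRNA
    abundance falls below min_mrna are skipped to avoid unstable ratios.
    """
    ratios = {}
    for gene, mrna in transcriptome.items():
        if mrna < min_mrna:
            continue
        ratios[gene] = translatome.get(gene, 0.0) / mrna
    return ratios


# Hypothetical normalized abundances.
transcriptome = {"geneA": 10.0, "geneB": 40.0, "geneC": 0.2}
translatome = {"geneA": 20.0, "geneB": 10.0, "geneC": 5.0}

ratios = translation_ratio(translatome, transcriptome)
```

    Under this assumed definition a ratio above 1 would suggest relatively heavy ribosome loading; any real analysis should follow the definitions given in the database's own documentation.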

  8. Introducing People – Genre Analysis and Oral Comprehension and Oral Production Tasks

    Directory of Open Access Journals (Sweden)

    Keila Rocha Reis de Carvalho

    2012-02-01

    Full Text Available This paper aims at presenting an analysis of the genre introducing people and at suggesting listening comprehension and oral production tasks. This work was developed according to the characterization of the rhetorical organization of situations taken from seventeen films that contain the genre under analysis. Although several studies in the ESP area carried out recently (Andrade, 2003; Cardoso, 2003; Shergue, 2003; Belmonte, 2003; Serafini, 2003) have identified listening comprehension and oral production as the abilities that should be prioritized in an English course, much needs to be done, especially concerning the oral genres that take into account the language the learners of English as a second language need in their target situation. This work is based on Hutchinson & Waters' (1987) theoretical background on ESP, Swales’ (1990) genre analysis, Ramos’ (2004) pedagogical proposal, and also on Ellis' (2003) tasks concept. The familiarization of learners of English as a second language with this genre will provide them with the opportunity to better understand and use the English language in their academic and professional life.

  9. Quantitative Proteomics for the Comprehensive Analysis of Stress Responses of Lactobacillus paracasei subsp. paracasei F19.

    Science.gov (United States)

    Schott, Ann-Sophie; Behr, Jürgen; Geißler, Andreas J; Kuster, Bernhard; Hahne, Hannes; Vogel, Rudi F

    2017-10-06

    Lactic acid bacteria are broadly employed as starter cultures in the manufacture of foods. Upon technological preparation, they are confronted with drying stress that amalgamates numerous stress conditions resulting in losses of fitness and survival. To better understand and differentiate physiological stress responses, discover general and specific markers for the investigated stress conditions, and predict optimal preconditioning for starter cultures, we performed a comprehensive genomic and quantitative proteomic analysis of a commonly used model system, Lactobacillus paracasei subsp. paracasei TMW 1.1434 (isogenic with F19) under 11 typical stress conditions, including among others oxidative, osmotic, pH, and pressure stress. We identified and quantified >1900 proteins in triplicate analyses, representing 65% of all genes encoded in the genome. The identified genes were thoroughly annotated in terms of subcellular localization prediction and biological functions, suggesting unbiased and comprehensive proteome coverage. In total, 427 proteins were significantly differentially expressed in at least one condition. Most notably, our analysis suggests that optimal preconditioning toward drying was predicted to be alkaline and high-pressure stress preconditioning. Taken together, we believe the presented strategy may serve as a prototypic example for the analysis and utility of employing quantitative-mass-spectrometry-based proteomics to study bacterial physiology.

  10. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  11. Expression profile analysis of long noncoding RNA in HER-2-enriched subtype breast cancer by next-generation sequencing and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yang F

    2016-02-01

    Full Text Available Fan Yang, Shixu Lyu, Siyang Dong, Yehuan Liu, Xiaohua Zhang, Ouchen Wang Department of Surgical Oncology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, People’s Republic of China Background: Human epidermal growth factor receptor 2 (HER-2)-enriched subtype breast cancer is associated with a more aggressive phenotype and shorter survival time. Long noncoding RNAs (LncRNAs) have essential roles in tumorigenesis and occupy a central place in cancer progression. Notably, few studies have focused on the dysregulation of LncRNAs in the HER-2-enriched subtype breast cancer. In this study, we analyzed the expression profile of LncRNAs and mRNAs in this particular subtype of breast cancer. Methods: Seven pairs of HER-2-enriched subtype breast cancer and normal tissue were sequenced. We screened out differently expressed genes and measured the correlation of the expression levels of dysregulated LncRNAs and HER-2 by Pearson’s correlation coefficient analysis. Gene ontology analysis and pathway analysis were used to understand the biological roles of these differently expressed genes. Pathway act network and coexpression network were constructed. Results: More than 1,300 LncRNAs and 2,800 mRNAs, which were significantly differently expressed, were identified. Among these LncRNAs, AFAP1-AS1 was the most dysregulated LncRNA, while ORM2 was the most dysregulated mRNA. LOC100288637 had the highest positive correlation coefficient of 0.93 with HER-2, while RPL13P5 had the highest negative correlation coefficient of -0.87. The pathway act network showed that MAPK signaling pathway, PI3K-Akt signaling pathway, metabolic pathways, cell cycle, and regulation of actin cytoskeleton were highly related with HER-2-enriched subtype breast cancer. Coexpression network recognized LINC00636, LINC01405, ADARB2-AS1, ST8SIA6-AS1, LINC00511, and DPP10-AS1 as core genes. Conclusion: These results analyze the functions of LncRNAs and provide
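
    The Methods of the record above measure co-variation between LncRNA and HER-2 expression with Pearson's correlation coefficient. A minimal standard-library sketch of that coefficient follows; the seven paired expression values are hypothetical and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression levels across seven samples.
her2_expression = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 2.2]
lncrna_expression = [1.0, 1.9, 0.7, 2.4, 1.5, 2.0, 1.1]
r = pearson(her2_expression, lncrna_expression)
```

    With only seven samples, a high coefficient has a wide confidence interval, so a significance test would normally accompany the value.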

  12. Using MetaboAnalyst 3.0 for Comprehensive Metabolomics Data Analysis.

    Science.gov (United States)

    Xia, Jianguo; Wishart, David S

    2016-09-07

    MetaboAnalyst (http://www.metaboanalyst.ca) is a comprehensive Web application for metabolomic data analysis and interpretation. MetaboAnalyst handles most of the common metabolomic data types from most kinds of metabolomics platforms (MS and NMR) for most kinds of metabolomics experiments (targeted, untargeted, quantitative). In addition to providing a variety of data processing and normalization procedures, MetaboAnalyst also supports a number of data analysis and data visualization tasks using a range of univariate and multivariate methods such as PCA (principal component analysis), PLS-DA (partial least squares discriminant analysis), heatmap clustering and machine learning methods. MetaboAnalyst also offers a variety of tools for metabolomic data interpretation including MSEA (metabolite set enrichment analysis), MetPA (metabolite pathway analysis), and biomarker selection via ROC (receiver operating characteristic) curve analysis, as well as time series and power analysis. This unit provides an overview of the main functional modules and the general workflow of the latest version of MetaboAnalyst (MetaboAnalyst 3.0), followed by eight detailed protocols. © 2016 by John Wiley & Sons, Inc.
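
    Biomarker selection via ROC curve analysis, mentioned above, usually summarizes a candidate marker by the area under the ROC curve (AUC). The sketch below computes the AUC through its pairwise-comparison interpretation: the fraction of positive and negative sample pairs in which the positive sample scores higher, with ties counted as one half. It is a generic illustration, not MetaboAnalyst's implementation, and the labels and metabolite levels are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical case (1) and control (0) labels with one metabolite's levels.
labels = [1, 0, 1, 0]
metabolite_level = [0.9, 0.8, 0.2, 0.1]
auc = roc_auc(labels, metabolite_level)
```

    The quadratic loop is acceptable for small sample sets; a rank-based formulation gives the same value more efficiently on large ones.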

  13. Nispero: a cloud-computing based Scala tool specially suited for bioinformatics data processing

    OpenAIRE

    Evdokim Kovach; Alexey Alekhin; Eduardo Pareja Tobes; Raquel Tobes; Eduardo Pareja; Marina Manrique

    2014-01-01

    Nowadays it is widely accepted that the bioinformatics data analysis is a real bottleneck in many research activities related to life sciences. High-throughput technologies like Next Generation Sequencing (NGS) have completely reshaped the biology and bioinformatics landscape. Undoubtedly NGS has allowed important progress in many life-sciences related fields but has also presented interesting challenges in terms of computation capabilities and algorithms. Many kinds of tasks related with NGS...

  14. libcov: A C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny

    OpenAIRE

    Butt, Davin; Roger, Andrew J; Blouin, Christian

    2005-01-01

    Background An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. Results The bioinformatics library libcov is a collection of C++ classes that provides a high and low-level interface to maximum likelihood phylogenetics, sequence analysis and a data structure for structural biological methods. libcov can be used ...

  15. ALF: a strategy for identification of unauthorized GMOs in complex mixtures by a GW-NGS method and dedicated bioinformatics analysis.

    Science.gov (United States)

    Košir, Alexandra Bogožalec; Arulandhu, Alfred J; Voorhuijzen, Marleen M; Xiao, Hongmei; Hagelaar, Rico; Staats, Martijn; Costessi, Adalberto; Žel, Jana; Kok, Esther J; Dijk, Jeroen P van

    2017-10-26

    The majority of feed products in industrialised countries contains materials derived from genetically modified organisms (GMOs). In parallel, the number of reports of unauthorised GMOs (UGMOs) is gradually increasing. There is a lack of specific detection methods for UGMOs, due to the absence of detailed sequence information and reference materials. In this research, an adapted genome walking approach was developed, called ALF: Amplification of Linearly-enriched Fragments. Coupling of ALF to NGS aims for simultaneous detection and identification of all GMOs, including UGMOs, in one sample, in a single analysis. The ALF approach was assessed on a mixture made of DNA extracts from four reference materials, in an uneven distribution, mimicking a real life situation. The complete insert and genomic flanking regions were known for three of the included GMO events, while for MON15985 only partial sequence information was available. Combined with a known organisation of elements, this GMO served as a model for a UGMO. We successfully identified sequences matching with this organisation of elements serving as proof of principle for ALF as new UGMO detection strategy. Additionally, this study provides a first outline of an automated, web-based analysis pipeline for identification of UGMOs containing known GM elements.

  16. The Screening of Genes Sensitive to Long-Term, Low-Level Microwave Exposure and Bioinformatic Analysis of Potential Correlations to Learning and Memory

    Institute of Scientific and Technical Information of China (English)

    ZHAO Ya Li; LI Ying Xian; MA Hong Bo; LI Dong; LI Hai Liang; JIANG Rui; KAN Guang Han; YANG Zhen Zhong; HUANG Zeng Xin

    2015-01-01

    Objective To gain a better understanding of gene expression changes in the brain following microwave exposure in mice. This study hopes to reveal mechanisms contributing to microwave-induced learning and memory dysfunction. Methods Mice were exposed to whole body 2100 MHz microwaves with specific absorption rates (SARs) of 0.45 W/kg, 1.8 W/kg, and 3.6 W/kg for 1 hour daily for 8 weeks. Differentially expressing genes in the brains were screened using high-density oligonucleotide arrays, with genes showing more significant differences further confirmed by RT-PCR. Results The gene chip results demonstrated that 41 genes (0.45 W/kg group), 29 genes (1.8 W/kg group), and 219 genes (3.6 W/kg group) were differentially expressed. GO analysis revealed that these differentially expressed genes were primarily involved in metabolic processes, cellular metabolic processes, regulation of biological processes, macromolecular metabolic processes, biosynthetic processes, cellular protein metabolic processes, transport, developmental processes, cellular component organization, etc. KEGG pathway analysis showed that these genes are mainly involved in pathways related to ribosome, Alzheimer's disease, Parkinson's disease, long-term potentiation, Huntington's disease, and Neurotrophin signaling. Construction of a protein interaction network identified several important regulatory genes including synbindin (sbdn), Crystallin (CryaB), PPP1CA, Ywhaq, Psap, Psmb1, Pcbp2, etc., which play important roles in the processes of learning and memory. Conclusion Long-term, low-level microwave exposure may inhibit learning and memory by affecting protein and energy metabolic processes and signaling pathways relating to neurological functions or diseases.

  17. Bioinformatic analysis of patient-derived ASPS gene expressions and ASPL-TFE3 fusion transcript levels identify potential therapeutic targets.

    Directory of Open Access Journals (Sweden)

    David G Covell

    Full Text Available Gene expression data, collected from ASPS tumors of seven different patients and from one immortalized ASPS cell line (ASPS-1), was analyzed jointly with patient ASPL-TFE3 (t(X;17)(p11;q25)) fusion transcript data to identify disease-specific pathways and their component genes. Data analysis of the pooled patient and ASPS-1 gene expression data, using conventional clustering methods, revealed a relatively small set of pathways and genes characterizing the biology of ASPS. These results could be largely recapitulated using only the gene expression data collected from patient tumor samples. The concordance between expression measures derived from ASPS-1 and both pooled and individual patient tumor data provided a rationale for extending the analysis to include patient ASPL-TFE3 fusion transcript data. A novel linear model was exploited to link gene expressions to fusion transcript data and used to identify a small set of ASPS-specific pathways and their gene expression. Cellular pathways that appear aberrantly regulated in response to the t(X;17)(p11;q25) translocation include the cell cycle and cell adhesion. The identification of pathways and gene subsets characteristic of ASPS supports current therapeutic strategies that target the FLT1 and MET, while also proposing additional targeting of genes found in pathways involved in the cell cycle (CHK1), cell adhesion (ARHGD1A), cell division (CDC6), control of meiosis (RAD51L3) and mitosis (BIRC5), and chemokine-related protein tyrosine kinase activity (CCL4).

  18. The Screening of Genes Sensitive to Long-Term, Low-Level Microwave Exposure and Bioinformatic Analysis of Potential Correlations to Learning and Memory.

    Science.gov (United States)

    Zhao, Ya Li; Li, Ying Xian; Ma, Hong Bo; Li, Dong; Li, Hai Liang; Jiang, Rui; Kan, Guang Han; Yang, Zhen Zhong; Huang, Zeng Xin

    2015-08-01

    To gain a better understanding of gene expression changes in the brain following microwave exposure in mice. This study hopes to reveal mechanisms contributing to microwave-induced learning and memory dysfunction. Mice were exposed to whole body 2100 MHz microwaves with specific absorption rates (SARs) of 0.45 W/kg, 1.8 W/kg, and 3.6 W/kg for 1 hour daily for 8 weeks. Differentially expressing genes in the brains were screened using high-density oligonucleotide arrays, with genes showing more significant differences further confirmed by RT-PCR. The gene chip results demonstrated that 41 genes (0.45 W/kg group), 29 genes (1.8 W/kg group), and 219 genes (3.6 W/kg group) were differentially expressed. GO analysis revealed that these differentially expressed genes were primarily involved in metabolic processes, cellular metabolic processes, regulation of biological processes, macromolecular metabolic processes, biosynthetic processes, cellular protein metabolic processes, transport, developmental processes, cellular component organization, etc. KEGG pathway analysis showed that these genes are mainly involved in pathways related to ribosome, Alzheimer's disease, Parkinson's disease, long-term potentiation, Huntington's disease, and Neurotrophin signaling. Construction of a protein interaction network identified several important regulatory genes including synbindin (sbdn), Crystallin (CryaB), PPP1CA, Ywhaq, Psap, Psmb1, Pcbp2, etc., which play important roles in the processes of learning and memory. Long-term, low-level microwave exposure may inhibit learning and memory by affecting protein and energy metabolic processes and signaling pathways relating to neurological functions or diseases. Copyright © 2015 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  19. Comprehensive analysis of temporal alterations in cellular proteome of Bacillus subtilis under curcumin treatment.

    Directory of Open Access Journals (Sweden)

    Panga Jaipal Reddy

    Full Text Available Curcumin is a natural dietary compound with antimicrobial activity against various gram positive and negative bacteria. This study aims to investigate the proteome level alterations in Bacillus subtilis due to curcumin treatment and identification of its molecular/cellular targets to understand the mechanism of action. We have performed a comprehensive proteomic analysis of B. subtilis AH75 strain at different time intervals of curcumin treatment (20, 60 and 120 min after the drug exposure, three replicates) to compare the protein expression profiles using two complementary quantitative proteomic techniques, 2D-DIGE and iTRAQ. To the best of our knowledge, this is the first comprehensive longitudinal investigation describing the effect of curcumin treatment on B. subtilis proteome. The proteomics analysis revealed several interesting targets such as UDP-N-acetylglucosamine 1-carboxyvinyltransferase 1, putative septation protein SpoVG and ATP-dependent Clp protease proteolytic subunit. Further, in silico pathway analysis using DAVID and KOBAS has revealed modulation of pathways related to the fatty acid metabolism and cell wall synthesis, which are crucial for cell viability. Our findings revealed that curcumin treatment led to inhibition of the cell wall and fatty acid synthesis in addition to differential expression of many crucial proteins involved in modulation of bacterial metabolism. Findings obtained from proteomics analysis were further validated using 5-cyano-2,3-ditolyl tetrazolium chloride (CTC) assay for respiratory activity, resazurin assay for metabolic activity and membrane integrity assay by potassium and inorganic phosphate leakage measurement. The gene expression analysis of selected cell wall biosynthesis enzymes has strengthened the proteomics findings and indicated the major effect of curcumin on cell division.

  20. Clinical value and potential pathways of miR-183-5p in bladder cancer: A study based on miRNA-seq data and bioinformatics analysis.

    Science.gov (United States)

    Gao, Jia-Min; Huang, Lin-Zhen; Huang, Zhi-Guang; He, Rong-Quan

    2018-04-01

    The clinicopathological value and exploration of the potential molecular mechanism of microRNA-183-5p (miR-183-5p) have been investigated in various cancers; however, to the best of the author's knowledge, no similar research has been reported for bladder cancer. In the present study, it was revealed that the expression level of miR-183-5p was notably increased in bladder cancer tissues compared with adjacent non-cancerous tissues (P=0.001) and was markedly increased in the tissue samples of papillary, pathological T stage (T0-T2) and pathological stage (I-II) compared with tissue samples of their counterparts (P=0.05), according to data from The Cancer Genome Atlas. Receiver operating characteristic analysis revealed the robust diagnostic value of miR-183-5p for distinguishing bladder cancer from non-cancerous bladder tissues (area under curve=0.948; 95% confidence interval: 0.919-0.977). Amplification and deep deletion of miR-183-5p were indicated by cBioPortal, accounting for 1% (4/412) of bladder cancer cases. Data from YM500v3 demonstrated that compared with other cancers, bladder cancer exhibited high expression levels of miR-183-5p, and miR-183-5p expression in primary solid tumors was much higher compared with solid normal tissues. A meta-analysis indicated that miR-183-5p was more highly expressed in bladder cancer samples compared with normal counterparts. A total of 88 potential target genes of miR-183-5p were identified, 13 of which were discerned as hub genes by protein-protein interaction. The epithelial-to-mesenchymal transition pathway was the most significantly enriched pathway by FunRich (P=0.0001). In summary, miR-183-5p may participate in the tumorigenesis and development of bladder cancer via certain signaling pathways, particularly the epithelial-to-mesenchymal transition pathway. However, the exact molecular mechanism of miR-183-5p in bladder cancer must be validated by in vitro and in vivo experiments.
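
    The area under the ROC curve reported above can be read as the probability that a randomly chosen tumour sample has a higher expression value than a randomly chosen non-cancerous sample. The following is a minimal Python sketch of that rank-based definition; the function name and the expression values are hypothetical illustrations and are not data or code from the study.

```python
def auc(tumour_scores, normal_scores):
    """Rank-based AUC: fraction of (tumour, normal) pairs in which the
    tumour value is higher; tied pairs count as half a win."""
    wins = 0.0
    for t in tumour_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumour_scores) * len(normal_scores))


# Hypothetical expression values, for illustration only.
tumour = [8.1, 7.4, 9.0, 6.2]
normal = [5.0, 6.5, 4.8]
print(round(auc(tumour, normal), 3))  # 11 of 12 pairs are ordered correctly
```

    An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.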

  1. Integrated ligand-receptor bioinformatic and in vitro functional analysis identifies active TGFA/EGFR signaling loop in papillary thyroid carcinomas.

    Directory of Open Access Journals (Sweden)

    Debora Degl'Innocenti

    Full Text Available BACKGROUND: Papillary thyroid carcinoma (PTC), the most frequent thyroid cancer, is usually not life threatening, but may recur or progress to aggressive forms resistant to conventional therapies. A more detailed understanding of the signaling pathways activated in PTCs may help to identify novel therapeutic approaches against these tumors. The aim of this study is to identify signaling pathways activated in PTCs. METHODOLOGY/PRINCIPAL FINDINGS: We examined coordinated gene expression patterns of ligand/receptor (L/R) pairs using the L/R database DRLP-rev1 and five publicly available thyroid cancer datasets of gene expression on a total of 41 paired PTC/normal thyroid tissues. We identified 26 (up) and 13 (down) L/R pairs coordinately and differentially expressed. The relevance of these L/R pairs was confirmed by performing the same analysis on REarranged during Transfection (RET)/PTC1-infected thyrocytes with respect to normal thyrocytes. TGFA/EGFR emerged as one of the most tightly regulated L/R pairs. Furthermore, PTC clinical samples analyzed by real-time RT-PCR expressed EGFR transcript levels similar to those of 5 normal thyroid tissues from patients with pathologies other than thyroid cancer, whereas significantly elevated levels of TGFA transcripts were only present in PTCs. Biochemical analysis of PTC cell lines demonstrated the presence of EGFR on the cell membrane and TGFA in conditioned media. Moreover, conditioned medium of the PTC cell line NIM-1 activated EGFR expressed on HeLa cells, culminating in both ERK and AKT phosphorylation. In NIM-1 cells harboring BRAF mutation, TGFA stimulated proliferation, contributing to PI3K/AKT activation independent of MEK/ERK signaling. CONCLUSIONS/SIGNIFICANCE: We compiled a reliable list of L/R pairs associated with PTC and validated the biological role of one of the emerged L/R pairs, the TGFA/EGFR, in this cancer, in vitro. These data provide a better understanding of the factors involved in the

  2. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    Directory of Open Access Journals (Sweden)

    Gennady Verkhivker

    2013-11-01

    Full Text Available A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system are lacking quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to a transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of the Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that the kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery.

  3. Bioinformatical Analysis of Organ-Related (Heart, Brain, Liver, and Kidney) and Serum Proteomic Data to Identify Protein Regulation Patterns and Potential Sepsis Biomarkers

    Directory of Open Access Journals (Sweden)

    Andreas Hohn

    2018-01-01

    Full Text Available During the last years, proteomic studies have revealed several interesting findings in experimental sepsis models and septic patients. However, most studies investigated protein alterations only in single organs or in whole blood. To identify possible sepsis biomarkers and to evaluate the relationship between protein alteration in sepsis affected organs and blood, proteomics data from the heart, brain, liver, kidney, and serum were analysed. Using functional network analyses in combination with hierarchical cluster analysis, we found that protein regulation patterns in organ tissues as well as in serum are highly dynamic. In the tissue proteome, the main functions and pathways affected were the oxidoreductive activity, cell energy generation, or metabolism, whereas in the serum proteome, functions were associated with lipoproteins metabolism and, to a minor extent, with coagulation, inflammatory response, and organ regeneration. Proteins from network analyses of organ tissue did not correlate with statistically significantly regulated serum proteins or with predicted proteins of serum functions. In this study, the combination of proteomic network analyses with cluster analyses is introduced as an approach to deal with high-throughput proteomics data to evaluate the dynamics of protein regulation during sepsis.

  4. Development and validation of MIX: comprehensive free software for meta-analysis of causal research data

    Directory of Open Access Journals (Sweden)

    Ikeda Noriaki

    2006-10-01

    Full Text Available Abstract Background Meta-analysis has become a well-known method for synthesis of quantitative data from previously conducted research in applied health sciences. So far, meta-analysis has been particularly useful in evaluating and comparing therapies and in assessing causes of disease. Consequently, the number of software packages that can perform meta-analysis has increased over the years. Unfortunately, it can take a substantial amount of time to get acquainted with some of these programs and most contain little or no interactive educational material. We set out to create and validate an easy-to-use and comprehensive meta-analysis package that would be simple enough programming-wise to remain available as a free download. We specifically aimed at students and researchers who are new to meta-analysis, with important parts of the development oriented towards creating internal interactive tutoring tools and designing features that would facilitate usage of the software as a companion to existing books on meta-analysis. Results We took an unconventional approach and created a program that uses Excel as a calculation and programming platform. The main programming language was Visual Basic, as implemented in Visual Basic 6 and Visual Basic for Applications in Excel 2000 and higher. The development took approximately two years and resulted in the 'MIX' program, which can be downloaded from the program's website free of charge. Next, we set out to validate the MIX output with two major software packages as reference standards, namely STATA (metan, metabias, and metatrim) and Comprehensive Meta-Analysis Version 2. Eight meta-analyses that had been published in major journals were used as data sources. All numerical and graphical results from analyses with MIX were identical to their counterparts in STATA and CMA. The MIX program distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the

  5. Bioinformatic Analysis of Plasma Apolipoproteins A-I and A-II Revealed Unique Features of A-I/A-II HDL Particles in Human Plasma

    Science.gov (United States)

    Kido, Toshimi; Kurata, Hideaki; Kondo, Kazuo; Itakura, Hiroshige; Okazaki, Mitsuyo; Urata, Takeyoshi; Yokoyama, Shinji

    2016-01-01

    Plasma concentration of apoA-I, apoA-II and apoA-II-unassociated apoA-I was analyzed in 314 Japanese subjects (177 males and 137 females), including one (male) homozygote and 37 (20 males and 17 females) heterozygotes of genetic CETP deficiency. ApoA-I unassociated with apoA-II markedly and linearly increased with HDL-cholesterol, while apoA-II increased only very slightly and the ratio of apoA-II-associated apoA-I to apoA-II stayed constant at 2 in molar ratio throughout the increase of HDL-cholesterol, among the wild type and heterozygous CETP deficiency. Thus, overall HDL concentration almost exclusively depends on HDL with apoA-I without apoA-II (LpAI) while concentration of HDL containing apoA-I and apoA-II (LpAI:AII) is constant having a fixed molar ratio of 2 : 1 regardless of total HDL and apoA-I concentration. Distribution of apoA-I between LpAI and LpAI:AII is consistent with a model of statistical partitioning regardless of sex and CETP genotype. The analysis also indicated that LpA-I accommodates on average 4 apoA-I molecules and has a clearance rate indistinguishable from LpAI:AII. Independent evidence indicated LpAI:A-II has a diameter 20% smaller than LpAI, consistent with a model having two apoA-I and one apoA-II. The functional contribution of these particles is to be investigated. PMID:27526664
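
    The partitioning described above reduces to simple molar arithmetic: if apoA-II-associated apoA-I is fixed at 2 mol per mol of apoA-II, the remainder of total apoA-I belongs to LpAI. The following is a minimal Python sketch of that bookkeeping; the function name and the input concentrations are hypothetical, and only the 2 : 1 ratio comes from the abstract.

```python
def partition_apoa1(total_apoa1, apoa2, ratio=2.0):
    """Split total apoA-I (molar units) into the part bound in LpAI:AII
    and the part carried by LpAI, assuming a fixed apoA-I : apoA-II
    molar ratio in LpAI:AII (2 : 1 in the abstract above)."""
    bound = ratio * apoa2
    if bound > total_apoa1:
        raise ValueError("apoA-II would require more apoA-I than is present")
    return bound, total_apoa1 - bound


# Hypothetical molar concentrations (same unit for both inputs).
in_lpai_aii, in_lpai = partition_apoa1(total_apoa1=50.0, apoa2=10.0)
print(in_lpai_aii, in_lpai)  # 20.0 30.0
```

    Under this assumption any rise in total apoA-I at constant apoA-II appears entirely in the LpAI fraction, which is the pattern the abstract reports.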

  6. [Recent advances in analysis of petroleum geological samples by comprehensive two-dimensional gas chromatography].

    Science.gov (United States)

    Gao, Xuanbo; Chang, Zhenyang; Dai, Wei; Tong, Ting; Zhang, Wanfeng; He, Sheng; Zhu, Shukui

    2014-10-01

    Abundant geochemical information can be acquired by analyzing the chemical compositions of petroleum geological samples. The information obtained from the analysis provides scientific evidence for petroleum exploration. However, these samples are complicated and can be easily influenced by physical (e.g., evaporation, emulsification, natural dispersion, dissolution and sorption), chemical (photodegradation) and biological (mainly microbial degradation) weathering processes. Therefore, it is very difficult to analyze the petroleum geological samples and they cannot be effectively separated by traditional gas chromatography/mass spectrometry. A newly developed separation technique, comprehensive two-dimensional gas chromatography (GC x GC), has unique advantages in complex sample analysis, and recently it has been applied to petroleum geological samples. This article mainly reviews the research progress in the last five years, the main problems and the future research about GC x GC applied in the area of petroleum geology.

  7. p3d--Python module for structural bioinformatics.

    Science.gov (United States)

    Fufezan, Christian; Specht, Michael

    2009-08-21

    High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.
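
    The combination of fast spatial lookup with set theory described above can be illustrated with a short Python sketch. This is not p3d's API: the Atom record, the GridIndex class and the coordinates are hypothetical, and a uniform hash grid is used here as a simpler stand-in for the binary space partitioning tree named in the abstract.

```python
from collections import defaultdict, namedtuple
from itertools import product
from math import dist, floor

# Hypothetical minimal atom record (not p3d's data model).
Atom = namedtuple("Atom", "serial name resname x y z")


class GridIndex:
    """Uniform hash grid for neighbour queries (stand-in for a BSP tree)."""

    def __init__(self, atoms, cell=5.0):
        self.cell = cell
        self.cells = defaultdict(list)
        for atom in atoms:
            self.cells[self._key(atom.x, atom.y, atom.z)].append(atom)

    def _key(self, x, y, z):
        return (floor(x / self.cell), floor(y / self.cell), floor(z / self.cell))

    def within(self, radius, centre):
        """Return the set of atoms no further than `radius` from `centre`."""
        reach = int(radius // self.cell) + 1  # grid cells to scan per axis
        cx, cy, cz = self._key(centre.x, centre.y, centre.z)
        hits = set()
        for dx, dy, dz in product(range(-reach, reach + 1), repeat=3):
            for atom in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                if dist(atom[3:], centre[3:]) <= radius:
                    hits.add(atom)
        return hits


# Hypothetical coordinates, for illustration only.
atoms = [
    Atom(1, "CA", "HIS", 0.0, 0.0, 0.0),
    Atom(2, "O", "HOH", 1.0, 2.0, 2.0),
    Atom(3, "O", "HOH", 8.0, 0.0, 0.0),
    Atom(4, "CB", "HIS", 1.0, 0.0, 0.0),
]
index = GridIndex(atoms)
waters = {a for a in atoms if a.resname == "HOH"}
near = index.within(4.0, atoms[0])

near_waters = near & waters   # set intersection: waters close to atom 1
near_protein = near - waters  # set difference: everything else close by
print(sorted(a.serial for a in near_waters))
```

    Because each query returns an ordinary Python set, spatial and chemical selections compose with the usual `&`, `|` and `-` operators.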

  8. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. Results p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  9. mORCA: sailing bioinformatics world with mobile devices.

    Science.gov (United States)

    Díaz-Del-Pino, Sergio; Falgueras, Juan; Perez-Wohlfeil, Esteban; Trelles, Oswaldo

    2018-03-01

    Nearly 10 years have passed since the first mobile apps appeared. Given the fact that bioinformatics is a web-based world and that mobile devices are endowed with web-browsers, it seemed natural that bioinformatics would transit from personal computers to mobile devices but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Through an in-depth analysis of requirements of existing bioinformatics applications we designed and deployed an easy-to-use web-based lightweight mobile client. The client is able to browse, select, compose automatically interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App store by Apple and Play Store by Google. The software will be available for at least 2 years. Contact: ortrelles@uma.es. Source code, final web-app, training material and documentation are available at http://bitlab-es.com/morca. © The Author(s) 2017. Published by Oxford University Press.

  10. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    Science.gov (United States)

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  11. Combining medical informatics and bioinformatics toward tools for personalized medicine.

    Science.gov (United States)

    Sarachan, B D; Simmons, M K; Subramanian, P; Temkin, J M

    2003-01-01

    Key bioinformatics and medical informatics research areas need to be identified to advance knowledge and understanding of disease risk factors and molecular disease pathology in the 21st century toward new diagnoses, prognoses, and treatments. Three high-impact informatics areas are identified: predictive medicine (to identify significant correlations within clinical data using statistical and artificial intelligence methods), along with pathway informatics and cellular simulations (that combine biological knowledge with advanced informatics to elucidate molecular disease pathology). Initial predictive models have been developed for a pilot study in Huntington's disease. An initial bioinformatics platform has been developed for the reconstruction and analysis of pathways, and work has begun on pathway simulation. A bioinformatics research program has been established at GE Global Research Center as an important technology toward next generation medical diagnostics. We anticipate that 21st century medical research will be a combination of informatics tools with traditional biology wet lab research, and that this will translate to increased use of informatics techniques in the clinic.

  12. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  13. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. The Aspergillus Mine - publishing bioinformatics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Rasmussen, Jane Lind Nybo; Theobald, Sebastian

    Genome analysis is no longer a field reserved for specialists and experimental laboratories are doing groundbreaking research using genome sequencing and analysis. In this new era, it is essential that data, analysis and results are shared between scientists. But this can be a challenge, even mor...

  15. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  16. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  17. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yongqun He

    2012-02-01

    Full Text Available Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of ten classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  18. Analyses of Brucella Pathogenesis, Host Immunity, and Vaccine Targets using Systems Biology and Bioinformatics

    Science.gov (United States)

    He, Yongqun

    2011-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning. PMID:22919594

  19. The Perseus computational platform for comprehensive analysis of (prote)omics data.

    Science.gov (United States)

    Tyanova, Stefka; Temu, Tikira; Sinitcyn, Pavel; Carlson, Arthur; Hein, Marco Y; Geiger, Tamar; Mann, Matthias; Cox, Jürgen

    2016-09-01

    A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.

  20. FUNCTIONAL ANALYSIS OF FUTURE MUSIC ART TEACHERS’ TRAINING FOR SINGING ACTIVITY OF COMPREHENSIVE SCHOOL SENIOR STUDENTS

    Directory of Open Access Journals (Sweden)

    Ma Chen

    2017-04-01

    Full Text Available This article presents a functional analysis of future music art teachers’ training for the singing activity of comprehensive school senior students. The issue is important because improving the training of educators and musicians contributes not only to professional self-actualisation but also to encouraging the young generation to study works of music art thoroughly and to develop creatively during group music tuition. Extracurricular singing activity also plays an important part: it reveals art images to students, enriches their creative experience, forms their spiritual world, develops independent thinking, and awakens creativity. The author identifies the main functions of future music art teachers’ training: the system and value, information, communication, creative and transformative, and projective functions. Special attention is paid to characterizing the features of each function. The author claims that the system and value function relates to the necessity of analyzing the results of the educational process, which contributes to students’ productive problem solving and to the main tasks of music training. The information function is a subject background of music art teachers’ pedagogical activities. The communicative function is realized in a teacher’s ability to develop students’ initiative to plan cooperative activities, to distribute duties, to carry out instructions, to coordinate cooperative activities, and to create special situations for the implementation of educational influence. The analysis of pedagogical and methodological literature shows that the creative and transformative function is manifested in the creative use of pedagogical and methodological ideas in specific pedagogical conditions. The projective function is thought to promote the most complete realization of the content of comprehensive and art education. Functional analysis of students’ training of art faculties at pedagogical universities to

  1. Comprehensive RNA-Seq Analysis on the Regulation of Tomato Ripening by Exogenous Auxin.

    Directory of Open Access Journals (Sweden)

    Jiayin Li

    Full Text Available Auxin has been shown to modulate the fruit ripening process. However, the molecular mechanisms underlying auxin regulation of fruit ripening are still not clear. Illumina RNA sequencing was performed on mature green cherry tomato fruit 1 and 7 days after auxin treatment, with untreated fruit as a control. The results showed that exogenous auxin maintained system 1 ethylene synthesis and delayed the onset of system 2 ethylene synthesis and the ripening process. At the molecular level, genes associated with stress resistance were significantly up-regulated, but genes related to carotenoid metabolism, cell degradation and energy metabolism were strongly down-regulated by exogenous auxin. Furthermore, genes encoding DNA demethylases were inhibited by auxin, whereas genes encoding cytosine-5 DNA methyltransferases were induced, which contributed to the maintenance of high methylation levels in the nucleus and thus inhibited the ripening process. Additionally, exogenous auxin altered the expression patterns of ethylene and auxin signaling-related genes that were induced or repressed in the normal ripening process, suggesting significant crosstalk between these two hormones during tomato ripening. The present work is the first comprehensive transcriptome analysis of auxin-treated tomato fruit during ripening. Our results provide comprehensive insights into the effects of auxin on the tomato ripening process and the mechanism of crosstalk between auxin and ethylene.

  2. A Principal Component Analysis/Fuzzy Comprehensive Evaluation for Rockburst Potential in Kimberlite

    Science.gov (United States)

    Pu, Yuanyuan; Apel, Derek; Xu, Huawei

    2018-02-01

    Kimberlite is an igneous rock which sometimes bears diamonds. Most of the diamonds mined in the world today are found in kimberlite ores. Burst potential in kimberlite has not been investigated, because kimberlite is mostly mined using open-pit mining, which poses very little threat of rock bursting. However, as the mining depth keeps increasing, the mines convert to underground mining methods, which can pose a threat of rock bursting in kimberlite. This paper focuses on the burst potential of kimberlite at a diamond mine in northern Canada. A combined model with the methods of principal component analysis (PCA) and fuzzy comprehensive evaluation (FCE) is developed to process data from 12 different locations in kimberlite pipes. Based on calculated 12 fuzzy evaluation vectors, 8 locations show a moderate burst potential, 2 locations show no burst potential, and 2 locations show strong and violent burst potential, respectively. Using statistical principles, a Mahalanobis distance is adopted to build a comprehensive fuzzy evaluation vector for the whole mine and the final evaluation for burst potential is moderate, which is verified by a practical rockbursting situation at mine site.

  3. Analysis of oxidised heavy paraffinic products by high temperature comprehensive two-dimensional gas chromatography.

    Science.gov (United States)

    Potgieter, H; Bekker, R; Beigley, J; Rohwer, E

    2017-08-04

    Heavy petroleum fractions are produced during crude and synthetic crude oil refining processes and they need to be upgraded to useable products to increase their market value. Usually these fractions are upgraded to fuel products by hydrocracking, hydroisomerization and hydrogenation processes. These fractions are also upgraded to other high value commercial products like lubricant oils and waxes by distillation, hydrogenation, and oxidation and/or blending. Oxidation of hydrogenated heavy paraffinic fractions produces high value products that contain a variety of oxygenates and the characterization of these heavy oxygenates is very important for the control of oxidation processes. Traditionally titrimetric procedures are used to monitor oxygenate formation, however, these titrimetric procedures are tedious and lack selectivity toward specific oxygenate classes in complex matrices. Comprehensive two-dimensional gas chromatography (GC×GC) is a way of increasing peak capacity for the comprehensive analysis of complex samples. Other groups have used HT-GC×GC to extend the carbon number range attainable by GC×GC and have optimised HT-GC×GC parameters for the separation of aromatics, nitrogen-containing compounds as well as sulphur-containing compounds in heavy petroleum fractions. HT-GC×GC column combinations for the separation of oxygenates in oxidised heavy paraffinic fractions are optimised in this study. The advantages of the HT-GC×GC method in the monitoring of the oxidation reactions of heavy paraffinic fraction samples are illustrated. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Bioinformatics analysis of SARS coronavirus genome polymorphism

    Directory of Open Access Journals (Sweden)

    Pavlović-Lažetić Gordana M

    2004-05-01

    Full Text Available Background: We have compared 38 isolates of the SARS-CoV complete genome. The main goal was twofold: first, to analyze and compare nucleotide sequences and to identify positions of single nucleotide polymorphism (SNP), insertions and deletions, and second, to group them according to sequence similarity, eventually pointing to phylogeny of SARS-CoV isolates. The comparison is based on genome polymorphism such as insertions or deletions and the number and positions of SNPs. Results: The nucleotide structure of all 38 isolates is presented. Based on insertions and deletions and dissimilarity due to SNPs, the dataset of all the isolates has been qualitatively classified into three groups each having their own subgroups. These are the A-group with "regular" isolates (no insertions / deletions except for 5' and 3' ends), the B-group of isolates with "long insertions", and the C-group of isolates with "many individual" insertions and deletions. The isolate with the smallest average number of SNPs, compared to other isolates, has been identified (TWH). The density distribution of SNPs, insertions and deletions for each group or subgroup, as well as cumulatively for all the isolates is also presented, along with the gene map for TWH. Since individual SNPs may have occurred at random, positions corresponding to multiple SNPs (occurring in two or more isolates) are identified and presented. This result revises some previous results of a similar type. Amino acid changes caused by multiple SNPs are also identified (for the annotated sequences), as well as presupposed amino acid changes for non-annotated ones. Exact SNP positions for the isolates in each group or subgroup are presented. Finally, a phylogenetic tree for the SARS-CoV isolates has been produced using the CLUSTALW program, showing high compatibility with former qualitative classification. Conclusions: The comparative study of SARS-CoV isolates provides essential information for genome polymorphism, indication of strain differences and variants evolution. It may help with the development of effective treatment.
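
    The idea of "multiple SNPs" (substitutions at a position shared by two or more isolates) can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and compared against an arbitrarily chosen reference; the function name, the toy sequences and the isolate labels are hypothetical, and this is not the authors' pipeline.

```python
def multiple_snp_positions(reference, isolates, min_isolates=2):
    """Return 1-based alignment positions where at least `min_isolates`
    isolates carry a base that differs from the reference.
    Gap characters ('-') are treated as indels and skipped, not counted as SNPs."""
    for name, seq in isolates.items():
        if len(seq) != len(reference):
            raise ValueError(f"isolate {name} is not aligned to the reference")
    positions = []
    for i, ref_base in enumerate(reference):
        if ref_base == "-":
            continue
        differing = sum(
            1 for seq in isolates.values()
            if seq[i] != "-" and seq[i] != ref_base
        )
        if differing >= min_isolates:
            positions.append(i + 1)
    return positions


# Toy, made-up alignment (not SARS-CoV data).
reference = "ACGTACGT"
isolates = {
    "iso1": "ACGTACGT",
    "iso2": "ATGTACGA",
    "iso3": "ATGTACGT",
    "iso4": "ACGAACGA",
}
print(multiple_snp_positions(reference, isolates))  # [2, 8]
```

    Position 4 is excluded in the toy data because only one isolate differs there, which mirrors the abstract's point that an individual SNP may have occurred at random.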

  5. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and microrrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  6. Corrective interpersonal experience in psychodrama group therapy: a comprehensive process analysis of significant therapeutic events.

    Science.gov (United States)

    McVea, Charmaine S; Gow, Kathryn; Lowe, Roger

    2011-07-01

    This study investigated the process of resolving painful emotional experience during psychodrama group therapy, by examining significant therapeutic events within seven psychodrama enactments. A comprehensive process analysis of four resolved and three not-resolved cases identified five meta-processes which were linked to in-session resolution. One was a readiness to engage in the therapeutic process, which was influenced by client characteristics and the client's experience of the group; and four were therapeutic events: (1) re-experiencing with insight; (2) activating resourcefulness; (3) social atom repair with emotional release; and (4) integration. A corrective interpersonal experience (social atom repair) healed the sense of fragmentation and interpersonal disconnection associated with unresolved emotional pain, and emotional release was therapeutically helpful when located within the enactment of this new role relationship. Protagonists who experienced resolution reported important improvements in interpersonal functioning and sense of self which they attributed to this experience.

  7. Generic dynamic wind turbine models for power system stability analysis: A comprehensive review

    DEFF Research Database (Denmark)

    Honrubia-Escribano, A.; Gómez-Lázaro, E.; Fortmann, J.

    2018-01-01

    In recent years, international working groups, mainly from the International Electrotechnical Commission (IEC) and the Western Electricity Coordinating Council (WECC), have made a major effort to develop generic (also known as simplified or standard) dynamic wind turbine models to be used for power system stability analysis. These models are required by power system operators to conduct the planning and operation activities of their networks since the use of detailed manufacturer models is not practical. This paper presents a comprehensive review of the work done in this field, based on the results obtained by IEC and WECC working groups in the course of their research, which have motivated the publication of the IEC 61400-27 in February 2015. The final published versions of the generic models developed according to the existing four wind turbine technology types are detailed, highlighting...

  8. Comprehensive analysis of the specificity of transcription activator-like effector nucleases

    DEFF Research Database (Denmark)

    Juillerat, Alexandre; Dubois, Gwendoline; Valton, Julien

    2014-01-01

    A key issue when designing and using DNA-targeting nucleases is specificity. Ideally, an optimal DNA-targeting tool has only one recognition site within a genomic sequence. In practice, however, almost all designer nucleases available today can accommodate one to several mutations within their target site. The ability to predict the specificity of targeting is thus highly desirable. Here, we describe the first comprehensive experimental study focused on the specificity of the four commonly used repeat variable diresidues (RVDs; NI:A, HD:C, NN:G and NG:T) incorporated in transcription activator-like effector nucleases (TALEN). The analysis of >15 500 unique TALEN/DNA cleavage profiles allowed us to monitor the specificity gradient of the RVDs along a TALEN/DNA binding array and to present a specificity scoring matrix for RVD/nucleotide association. Furthermore, we report that TALEN can only...
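
    The RVD-to-nucleotide code quoted in this abstract (NI:A, HD:C, NN:G, NG:T) can be illustrated with a minimal Python sketch. The function name and the example RVD array are hypothetical; the sketch only maps each RVD to its nominally preferred base and does not model the specificity gradient or the scoring matrix reported in the study.

```python
# Nominal base preference of the four common RVDs, as listed in the abstract.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NN": "G", "NG": "T"}


def predicted_target(rvds):
    """Return the DNA sequence nominally preferred by an array of RVDs
    (hypothetical helper; one base per repeat, read in array order)."""
    bases = []
    for rvd in rvds:
        if rvd not in RVD_TO_BASE:
            raise ValueError(f"unknown RVD: {rvd}")
        bases.append(RVD_TO_BASE[rvd])
    return "".join(bases)


# Made-up repeat array used only for illustration.
example_array = ["NI", "HD", "NN", "NG", "NG"]
print(predicted_target(example_array))  # ACGTT
```

    A one-to-one lookup like this is an idealization: the abstract itself notes that designer nucleases can tolerate mismatches, which is why the study derives a scoring matrix rather than a fixed code.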

  9. Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome

    DEFF Research Database (Denmark)

    Peng, Zhiyu; Cheng, Yanbing; Tan, Bertrand Chin-Ming

    2012-01-01

    RNA editing is a post-transcriptional event that recodes hereditary information. Here we describe a comprehensive profile of the RNA editome of a male Han Chinese individual based on analysis of ∼767 million sequencing reads from poly(A)(+), poly(A)(-) and small RNA samples. We developed a computational pipeline that carefully controls for false positives while calling RNA editing events from genome and whole-transcriptome data of the same individual. We identified 22,688 RNA editing events in noncoding genes and introns, untranslated regions and coding sequences of protein-coding genes. Most changes (∼93%) converted A to I(G), consistent with known editing mechanisms based on adenosine deaminase acting on RNA (ADAR). We also found evidence of other types of nucleotide changes; however, these were validated at lower rates. We found 44 editing sites in microRNAs (miRNAs), suggesting a potential...

  10. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora of programs for PCR primer design reflects the...

  11. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  12. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  13. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  14. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  15. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  16. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  17. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  18. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is the need for appropriate strategies of introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool an implementation of ...

  19. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  20. A Comprehensive Database and Analysis Framework To Incorporate Multiscale Data Types and Enable Integrated Analysis of Bioactive Polyphenols.

    Science.gov (United States)

    Ho, Lap; Cheng, Haoxiang; Wang, Jun; Simon, James E; Wu, Qingli; Zhao, Danyue; Carry, Eileen; Ferruzzi, Mario G; Faith, Jeremiah; Valcarcel, Breanna; Hao, Ke; Pasinetti, Giulio M

    2018-03-05

    The development of a given botanical preparation for eventual clinical application requires extensive, detailed characterizations of the chemical composition, as well as the biological availability, biological activity, and safety profiles of the botanical. These issues are typically addressed using diverse experimental protocols and model systems. Based on this consideration, in this study we established a comprehensive database and analysis framework for the collection, collation, and integrative analysis of diverse, multiscale data sets. Using this framework, we conducted an integrative analysis of heterogeneous data from in vivo and in vitro investigation of a complex bioactive dietary polyphenol-rich preparation (BDPP) and built an integrated network linking data sets generated from this multitude of diverse experimental paradigms. We established a comprehensive database and analysis framework as well as a systematic and logical means to catalogue and collate the diverse array of information gathered, which is securely stored and added to in a standardized manner to enable fast query. We demonstrated the utility of the database in (1) a statistical ranking scheme to prioritize response to treatments and (2) in depth reconstruction of functionality studies. By examination of these data sets, the system allows analytical querying of heterogeneous data and the access of information related to interactions, mechanism of actions, functions, etc., which ultimately provide a global overview of complex biological responses. Collectively, we present an integrative analysis framework that leads to novel insights on the biological activities of a complex botanical such as BDPP that is based on data-driven characterizations of interactions between BDPP-derived phenolic metabolites and their mechanisms of action, as well as synergism and/or potential cancellation of biological functions. 
Our integrative analytical approach provides novel means for a systematic integrative

  1. A comprehensive analysis of high-magnitude streamflow and trends in the Central Valley, California

    Science.gov (United States)

    Kocis, T. N.; Dahlke, H. E.

    2017-12-01

    California's climate is characterized by the largest precipitation and streamflow variability observed within the conterminous US. This, combined with chronic groundwater overdraft of 0.6-3.5 km³ yr⁻¹, creates the need to identify additional surface water sources available for groundwater recharge using methods such as agricultural groundwater banking, aquifer storage and recovery, and spreading basins. High-magnitude streamflow, i.e. flow above the 90th percentile, that exceeds environmental flow requirements and current surface water allocations under California water rights, could be a viable source of surface water for groundwater banking. Here, we present a comprehensive analysis of the magnitude, frequency, duration and timing of high-magnitude streamflow (HMF "metrics") over multiple time periods for 93 stream gauges covering the Sacramento, San Joaquin and Tulare basins in California. In addition, we present trend analyses conducted on the same dataset and all HMF metrics using generalized additive models, the Mann-Kendall trend test, and the Signal to Noise Ratio test. The results of the comprehensive analysis show, in short, that in an average year with HMF approximately 3.2 km³ of high-magnitude flow is exported from the entire Central Valley to the Sacramento-San Joaquin Delta, often at times when environmental flow requirements of the Delta and major rivers are exceeded. High-magnitude flow occurs, on average, during 7 and 4.7 out of 10 years in the Sacramento River and the San Joaquin-Tulare Basins, respectively, from just a few storm events (5-7 1-day peak events) lasting for a total of 25-30 days between November and April. Preliminary trend tests suggest that all HMF metrics show limited change over the last 50 years. As a whole, the results suggest that there is sufficient unmanaged surface water physically available to mitigate long-term groundwater overdraft in the Central Valley.
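
    The Mann-Kendall trend test named in this abstract can be sketched in a few lines of Python. This is a minimal textbook version that assumes a series without tied values and uses the normal approximation with continuity correction; the function name and the toy annual values are hypothetical, and the sketch is not the authors' analysis code.

```python
import math


def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test.
    S counts concordant minus discordant pairs; Z is the normal-approximation
    score with continuity correction. Assumes no tied values in the series."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S when there are no ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Made-up annual high-magnitude flow volumes (arbitrary units), for illustration only.
annual_hmf = [3.1, 2.8, 3.5, 3.0, 3.9, 4.2]
s_stat, z_score = mann_kendall(annual_hmf)
print(s_stat, round(z_score, 3))
```

    A |Z| value above roughly 1.96 would indicate a monotonic trend at the 5% level under these assumptions; real streamflow records often contain ties and serial correlation, which call for the corrected variance formulas that this sketch omits.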

  2. Comprehensive structural analysis of the HCPB demo blanket under thermal, mechanical, electromagnetic and radiation induced loads

    International Nuclear Information System (INIS)

    Boccaccini, L.V.; Norajitra, P.; Ruatto, P.; Scaffidi-Argentina, F.

    1998-01-01

    For the helium-cooled pebble bed (HCPB) blanket, which is one of the two reference concepts studied within the European Demo Development Program, a comprehensive finite element (FEM) structural analysis has been performed. The analysis refers to the steady-state operating conditions of an outboard blanket segment. On the basis of a three-dimensional model of radial-toroidal sections of the segment box, thermal stresses caused by the temperature gradients in the blanket structure have been calculated. Furthermore, the mechanical loads due to coolant pressure in normal operating conditions as well as an accidental over-pressurization of the blanket box have been accounted for. The stresses caused by a central plasma major disruption from an initial current of 20 MA to zero in 20 ms have been also taken into account. Radiation-induced dimensional changes of breeder and multiplier material caused by both helium production and neutron damage, have also been evaluated and discussed. All the above loads have been combined as input for a FEM stress analysis and the resulting stress distribution has been evaluated according to the American Society of Mechanical Engineers (ASME) norms. (orig.)

  3. Comprehensive neutron cross-section and secondary energy distribution uncertainty analysis for a fusion reactor

    International Nuclear Information System (INIS)

    Gerstl, S.A.W.; LaBauve, R.J.; Young, P.G.

    1980-05-01

    Using General Atomic's well-documented Power Generating Fusion Reactor (PGFR) design as an example, this report carries out a comprehensive neutron cross-section and secondary energy distribution (SED) uncertainty analysis. The LASL sensitivity and uncertainty analysis code SENSIT is used to calculate reaction cross-section sensitivity profiles and integral SED sensitivity coefficients. These are then folded with covariance matrices and integral SED uncertainties to obtain the resulting uncertainties of three calculated neutronics design parameters: two critical radiation damage rates and a nuclear heating rate. The report documents the first sensitivity-based data uncertainty analysis that incorporates a quantitative treatment of the effects of SED uncertainties. The results demonstrate quantitatively that the ENDF/B-V cross-section data files for C, H, and O, including their SED data, are fully adequate for this design application, while the data for Fe and Ni are at best marginally adequate because they give rise to response uncertainties up to 25%. Much higher response uncertainties are caused by cross-section and SED data uncertainties in Cu (26 to 45%), tungsten (24 to 54%), and Cr (up to 98%). Specific recommendations are given for re-evaluations of certain reaction cross-sections, secondary energy distributions, and uncertainty estimates.
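The folding of sensitivity profiles with covariance matrices described above is commonly written as the first-order propagation S^T C S. The Python sketch below illustrates that general formula only. It is not the SENSIT implementation, and the function name and the flat list-of-lists layout are assumptions.

```python
from math import sqrt


def response_relative_variance(sensitivities, rel_covariance):
    """First-order relative variance of a response: sum_ij S_i * C_ij * S_j.

    sensitivities:  relative sensitivity of the response per energy group.
    rel_covariance: relative covariance matrix of the cross-section data,
                    given as a square list of lists.
    """
    n = len(sensitivities)
    return sum(
        sensitivities[i] * rel_covariance[i][j] * sensitivities[j]
        for i in range(n)
        for j in range(n)
    )


def response_relative_std(sensitivities, rel_covariance):
    """Relative standard deviation of the response (square root of the variance)."""
    return sqrt(response_relative_variance(sensitivities, rel_covariance))
```

Response uncertainties such as the percentages quoted in the abstract are of the kind returned by `response_relative_std`, expressed as a fraction.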

  4. Comprehensive Method for Culturing Embryonic Dorsal Root Ganglion Neurons for Seahorse Extracellular Flux XF24 Analysis.

    Science.gov (United States)

    Lange, Miranda; Zeng, Yan; Knight, Andrew; Windebank, Anthony; Trushina, Eugenia

    2012-01-01

    Changes in mitochondrial dynamics and function contribute to progression of multiple neurodegenerative diseases including peripheral neuropathies. The Seahorse Extracellular Flux XF24 analyzer provides a comprehensive assessment of the relative state of glycolytic and aerobic metabolism in live cells making this method instrumental in assessing mitochondrial function. One of the most important steps in the analysis of mitochondrial respiration using the Seahorse XF24 analyzer is plating a uniform monolayer of firmly attached cells. However, culturing of primary dorsal root ganglion (DRG) neurons is associated with multiple challenges, including their propensity to form clumps and detach from the culture plate. This could significantly interfere with proper analysis and interpretation of data. We have tested multiple cell culture parameters including coating substrates, culture medium, XF24 microplate plastics, and plating techniques in order to optimize plating conditions. Here we describe a highly reproducible method to obtain neuron-enriched monolayers of securely attached dissociated primary embryonic (E15) rat DRG neurons suitable for analysis with the Seahorse XF24 platform.

  5. Comprehensive method for culturing embryonic dorsal root ganglion neurons for Seahorse Extracellular Flux XF24 Analysis

    Directory of Open Access Journals (Sweden)

    Miranda L. Lange

    2012-12-01

    Full Text Available Changes in mitochondrial dynamics and function contribute to progression of multiple neurodegenerative diseases including peripheral neuropathies. The Seahorse Extracellular Flux XF24 analyzer provides a comprehensive assessment of the relative state of glycolytic and aerobic metabolism in live cells making this method instrumental in assessing mitochondrial function. One of the most important steps in the analysis of mitochondrial respiration using the Seahorse XF24 analyzer is plating a uniform monolayer of firmly attached cells. However, culturing of primary dorsal root ganglion (DRG) neurons is associated with multiple challenges, including their propensity to form clumps and detach from the culture plate. This could significantly interfere with proper analysis and interpretation of data. We have tested multiple cell culture parameters including coating substrates, culture medium, XF24 microplate plastics, and plating techniques in order to optimize plating conditions. Here we describe a highly reproducible method to obtain neuron-enriched monolayers of securely attached dissociated primary embryonic (E15) rat DRG neurons suitable for analysis with the Seahorse XF24 platform.

  6. Molecular bioinformatics: algorithms and applications

    National Research Council Canada - National Science Library

    Schulze-Kremer, S

    1996-01-01

    ... on molecular biology, especially DNA sequence analysis and protein structure prediction. These two issues are also central to this book. Other application areas covered here are: interpretation of spectroscopic data and discovery of structure-function relationships in DNA and proteins. Figure 1 depicts the interdependence of computer science,...

  7. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  8. Primary treatments for clinically localised prostate cancer: a comprehensive lifetime cost-utility analysis.

    Science.gov (United States)

    Cooperberg, Matthew R; Ramakrishna, Naren R; Duff, Steven B; Hughes, Kathleen E; Sadownik, Sara; Smith, Joseph A; Tewari, Ashutosh K

    2013-03-01

    WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: Multiple treatment alternatives exist for localised prostate cancer, with few high-quality studies directly comparing their comparative effectiveness and costs. The present study is the most comprehensive cost-effectiveness analysis to date for localised prostate cancer, conducted with a lifetime horizon and accounting for survival, health-related quality-of-life, and cost impact of secondary treatments and other downstream events, as well as primary treatment choices. The analysis found minor differences, generally slightly favouring surgical methods, in quality-adjusted life years across treatment options. However, radiation therapy (RT) was consistently more expensive than surgery, and some alternatives, e.g. intensity-modulated RT for low-risk disease, were dominated - that is, both more expensive and less effective than competing alternatives. To characterise the costs and outcomes associated with radical prostatectomy (open, laparoscopic, or robot-assisted) and radiation therapy (RT: dose-escalated three-dimensional conformal RT, intensity-modulated RT, brachytherapy, or combination), using a comprehensive, lifetime decision analytical model. A Markov model was constructed to follow hypothetical men with low-, intermediate-, and high-risk prostate cancer over their lifetimes after primary treatment; probabilities of outcomes were based on an exhaustive literature search yielding 232 unique publications. In each Markov cycle, patients could have remission, recurrence, salvage treatment, metastasis, death from prostate cancer, and death from other causes. Utilities for each health state were determined, and disutilities were applied for complications and toxicities of treatment. Costs were determined from the USA payer perspective, with incorporation of patient costs in a sensitivity analysis. 
    Differences across treatments in quality-adjusted life years were modest, ranging from 10.3 to
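The Markov cohort approach described in this record can be sketched in a few lines of Python. The state names below are a reduced, hypothetical subset of those listed in the abstract, and every probability and utility value is invented for illustration. The sketch also omits discounting, disutilities and costs, which the study's model includes.

```python
def run_markov_cohort(start, transition, utilities, cycles):
    """Propagate a cohort through a Markov model and accumulate QALYs.

    start:      {state: share of the cohort at cycle 0}
    transition: {state: {next_state: probability per cycle}}
    utilities:  {state: utility weight per cycle (one cycle = one year here)}
    """
    states = list(start)
    dist = dict(start)
    qalys = 0.0
    for _ in range(cycles):
        # Credit the utility of the states occupied during this cycle.
        qalys += sum(dist[s] * utilities[s] for s in states)
        # Move the cohort one cycle forward.
        dist = {t: sum(dist[s] * transition[s].get(t, 0.0) for s in states)
                for t in states}
    return qalys, dist


# Hypothetical inputs for illustration only; not values from the study.
transition = {
    "remission":  {"remission": 0.90, "recurrence": 0.07, "dead": 0.03},
    "recurrence": {"recurrence": 0.80, "metastasis": 0.15, "dead": 0.05},
    "metastasis": {"metastasis": 0.70, "dead": 0.30},
    "dead":       {"dead": 1.0},
}
utilities = {"remission": 0.90, "recurrence": 0.75, "metastasis": 0.50, "dead": 0.0}
start = {"remission": 1.0, "recurrence": 0.0, "metastasis": 0.0, "dead": 0.0}

qalys, final = run_markov_cohort(start, transition, utilities, cycles=20)
```

Comparing treatments would then amount to running the same loop with treatment-specific transition probabilities and comparing the accumulated totals.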

  9. A Comprehensive, Open-source Platform for Mass Spectrometry-based Glycoproteomics Data Analysis.

    Science.gov (United States)

    Liu, Gang; Cheng, Kai; Lo, Chi Y; Li, Jun; Qu, Jun; Neelamegham, Sriram

    2017-11-01

    Glycosylation is among the most abundant and diverse protein post-translational modifications (PTMs) identified to date. The structural analysis of this PTM is challenging because of the diverse monosaccharides which are not conserved among organisms, the branched nature of glycans, their isomeric structures, and heterogeneity in the glycan distribution at a given site. Glycoproteomics experiments have adopted the traditional high-throughput LC-MSn proteomics workflow to analyze site-specific glycosylation. However, comprehensive computational platforms for data analyses are scarce. To address this limitation, we present a comprehensive, open-source, modular software for glycoproteomics data analysis called GlycoPAT (GlycoProteomics Analysis Toolbox; freely available from www.VirtualGlycome.org/glycopat). The program includes three major advances: (1) "SmallGlyPep," a minimal linear representation of glycopeptides for MSn data analysis. This format allows facile serial fragmentation of both the peptide backbone and PTM at one or more locations. (2) A novel scoring scheme based on calculation of the "Ensemble Score (ES)," a measure that scores and rank-orders MS/MS spectra for N- and O-linked glycopeptides using cross-correlation and probability based analyses. (3) A false discovery rate (FDR) calculation scheme where decoy glycopeptides are created by simultaneously scrambling the amino acid sequence and by introducing artificial monosaccharides by perturbing the original sugar mass. Parallel computing facilities and user-friendly GUIs (Graphical User Interfaces) are also provided. GlycoPAT is used to catalogue site-specific glycosylation on simple glycoproteins, standard protein mixtures and human plasma cryoprecipitate samples in three common MS/MS fragmentation modes: CID, HCD and ETD. It is also used to identify 960 unique glycopeptides in cell lysates from prostate cancer cells. The results show that the simultaneous consideration of peptide and glycan
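The decoy-based FDR scheme mentioned in point (3) can be illustrated with the common target-decoy estimate, in which the number of decoy hits above a score cut-off approximates the number of false target hits. The Python sketch below shows that general estimate only. It is not GlycoPAT code, the abstract does not state the exact formula GlycoPAT uses, and the function name and `(score, is_decoy)` input layout are assumptions.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    matches: iterable of (score, is_decoy) pairs, one per spectrum match.
    """
    matches = list(matches)
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    # With no target hits there is nothing to report, so return 0.0.
    return decoys / targets if targets else 0.0
```

In practice the threshold would be lowered until the estimate reaches a chosen limit, for example 1%.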

  10. Prioritizing strategies for comprehensive liver cancer control in Asia: a conjoint analysis

    Directory of Open Access Journals (Sweden)

    Bridges John FP

    2012-10-01

    Full Text Available Background: Liver cancer is a complex and burdensome disease, with Asia accounting for 75% of known cases. Comprehensive cancer control requires the use of multiple strategies, but various stakeholders may have different views as to which strategies should have the highest priority. This study identified priorities across multiple strategies for comprehensive liver cancer control (CLCC) from the perspective of liver cancer clinical, policy, and advocacy stakeholders in China, Japan, South Korea and Taiwan. Concordance of priorities was assessed across the region and across respondent roles. Methods: Priorities for CLCC were examined as part of a cross-sectional survey of liver cancer experts. Respondents completed several conjoint-analysis choice tasks to prioritize 11 strategies. In each task, respondents judged which of two competing CLCC plans, consisting of mutually exclusive and exhaustive subsets of the strategies, would have the greatest impact. The dependent variable was the chosen plan, which was then regressed on the strategies of different plans. The restricted least squares (RLS) method was utilized to compare aggregate and stratified models, and t-tests and Wald tests were used to test for significance and concordance, respectively. Results: Eighty respondents (69.6%) were eligible and completed the survey. Their primary interests were hepatitis (26%), hepatocellular carcinoma (HCC) (58%), metastatic liver cancer (10%) and transplantation (6%). The most preferred strategies were monitoring at-risk populations (p<0.001), clinician education (p<0.001), and national guidelines (p<0.001). Most priorities were concordant across sites except for three strategies: transplantation infrastructure (p=0.009) was valued lower in China, measuring social burden (p=0.037) was valued higher in Taiwan, and national guidelines (p=0.025) was valued higher in China. Priorities did not differ across stakeholder groups (p=0.438). Conclusions: Priorities for CLCC in Asia include monitoring at-risk populations, clinician education, national guidelines

  11. Prioritizing strategies for comprehensive liver cancer control in Asia: a conjoint analysis.

    Science.gov (United States)

    Bridges, John F P; Dong, Liming; Gallego, Gisselle; Blauvelt, Barri M; Joy, Susan M; Pawlik, Timothy M

    2012-10-30

    Liver cancer is a complex and burdensome disease, with Asia accounting for 75% of known cases. Comprehensive cancer control requires the use of multiple strategies, but various stakeholders may have different views as to which strategies should have the highest priority. This study identified priorities across multiple strategies for comprehensive liver cancer control (CLCC) from the perspective of liver cancer clinical, policy, and advocacy stakeholders in China, Japan, South Korea and Taiwan. Concordance of priorities was assessed across the region and across respondent roles. Priorities for CLCC were examined as part of a cross-sectional survey of liver cancer experts. Respondents completed several conjoint-analysis choice tasks to prioritize 11 strategies. In each task, respondents judged which of two competing CLCC plans, consisting of mutually exclusive and exhaustive subsets of the strategies, would have the greatest impact. The dependent variable was the chosen plan, which was then regressed on the strategies of different plans. The restricted least squares (RLS) method was utilized to compare aggregate and stratified models, and t-tests and Wald tests were used to test for significance and concordance, respectively. Eighty respondents (69.6%) were eligible and completed the survey. Their primary interests were hepatitis (26%), hepatocellular carcinoma (HCC) (58%), metastatic liver cancer (10%) and transplantation (6%). The most preferred strategies were monitoring at-risk populations (p<0.001), clinician education (p<0.001), and national guidelines (p<0.001). Most priorities were concordant across sites except for three strategies: transplantation infrastructure (p=0.009) was valued lower in China, measuring social burden (p=0.037) was valued higher in Taiwan, and national guidelines (p=0.025) was valued higher in China. Priorities did not differ across stakeholder groups (p=0.438). Priorities for CLCC in Asia include monitoring at-risk populations

  12. Comprehensive Care

    Science.gov (United States)

    ... Comprehensive Care Understand the importance of comprehensive MS care ... In this article A complex disease requires a comprehensive approach Today multiple sclerosis (MS) is not a ...

  13. Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae

    Science.gov (United States)

    Reguly, Teresa; Breitkreutz, Ashton; Boucher, Lorrie; Breitkreutz, Bobby-Joe; Hon, Gary C; Myers, Chad L; Parsons, Ainslie; Friesen, Helena; Oughtred, Rose; Tong, Amy; Stark, Chris; Ho, Yuen; Botstein, David; Andrews, Brenda; Boone, Charles; Troyanskaya, Olga G; Ideker, Trey; Dolinski, Kara; Batada, Nizar N; Tyers, Mike

    2006-01-01

    Background: The study of complex biological networks and prediction of gene function has been enabled by high-throughput (HTP) methods for detection of genetic and protein interactions. Sparse coverage in HTP datasets may, however, distort network properties and confound predictions. Although a vast number of well substantiated interactions are recorded in the scientific literature, these data have not yet been distilled into networks that enable system-level inference. Results: We describe here a comprehensive database of genetic and protein interactions, and associated experimental evidence, for the budding yeast Saccharomyces cerevisiae, as manually curated from over 31,793 abstracts and online publications. This literature-curated (LC) dataset contains 33,311 interactions, on the order of all extant HTP datasets combined. Surprisingly, HTP protein-interaction datasets currently achieve only around 14% coverage of the interactions in the literature. The LC network nevertheless shares attributes with HTP networks, including scale-free connectivity and correlations between interactions, abundance, localization, and expression. We find that essential genes or proteins are enriched for interactions with other essential genes or proteins, suggesting that the global network may be functionally unified. This interconnectivity is supported by a substantial overlap of protein and genetic interactions in the LC dataset. We show that the LC dataset considerably improves the predictive power of network-analysis approaches. The full LC dataset is available at the BioGRID and SGD databases. Conclusion: Comprehensive datasets of biological interactions derived from the primary literature provide critical benchmarks for HTP methods, augment functional prediction, and reveal system-level attributes of biological networks. PMID:16762047

  14. Differentiation of endosperm transfer cells of barley: a comprehensive analysis at the micro-scale.

    Science.gov (United States)

    Thiel, Johannes; Riewe, David; Rutten, Twan; Melzer, Michael; Friedel, Swetlana; Bollenbeck, Felix; Weschke, Winfriede; Weber, Hans

    2012-08-01

    Barley endosperm cells differentiate into transfer cells (ETCs) opposite the nucellar projection. To comprehensively analyse ETC differentiation, laser microdissection-based transcript and metabolite profiles were obtained from laser microdissected tissues and cell morphology was analysed. Flange-like secondary-wall ingrowths appeared between 5 and 7 days after pollination within the three outermost cell layers. Gene expression analysis indicated that ethylene-signalling pathways initiate ETC morphology. This is accompanied by gene activity related to cell shape control and vesicle transport, with abundant mitochondria and endomembrane structures. Gene expression analyses indicate predominant formation of hemicelluloses, glucuronoxylans and arabinoxylans, and transient formation of callose, together with proline and 4-hydroxyproline biosynthesis. Activation of the methylation cycle is probably required for biosynthesis of phospholipids, pectins and ethylene. Membrane microdomains involving sterols/sphingolipids and remorins are potentially involved in ETC development. The transcriptional activity of assimilate and micronutrient transporters suggests ETCs as the main uptake organs of solutes into the endosperm. Accordingly, the endosperm grows maximally after ETCs are fully developed. Up-regulated gene expression related to amino acid catabolism, C:N balances, carbohydrate oxidation, mitochondrial activity and starch degradation meets high demands for respiratory energy and carbohydrates, required for cell proliferation and wall synthesis. At 10 days after pollination, ETCs undergo further differentiation, potentially initiated by abscisic acid, and metabolism is reprogrammed as shown by activated storage and stress-related processes. Overall, the data provide a comprehensive view of barley ETC differentiation and development, and identify candidate genes and associated pathways. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  15. Comprehensive analysis of the mutation spectrum in 301 German ALS families.

    Science.gov (United States)

    Müller, Kathrin; Brenner, David; Weydt, Patrick; Meyer, Thomas; Grehl, Torsten; Petri, Susanne; Grosskreutz, Julian; Schuster, Joachim; Volk, Alexander E; Borck, Guntram; Kubisch, Christian; Klopstock, Thomas; Zeller, Daniel; Jablonka, Sibylle; Sendtner, Michael; Klebe, Stephan; Knehr, Antje; Günther, Kornelia; Weis, Joachim; Claeys, Kristl G; Schrank, Berthold; Sperfeld, Anne-Dorte; Hübers, Annemarie; Otto, Markus; Dorst, Johannes; Meitinger, Thomas; Strom, Tim M; Andersen, Peter M; Ludolph, Albert C; Weishaupt, Jochen H

    2018-04-12

    Recent advances in amyotrophic lateral sclerosis (ALS) genetics have revealed that mutations in any of more than 25 genes can cause ALS, mostly as an autosomal-dominant Mendelian trait. Detailed knowledge about the genetic architecture of ALS in a specific population will be important for genetic counselling but also for genotype-specific therapeutic interventions. Here we combined fragment length analysis, repeat-primed PCR, Southern blotting, Sanger sequencing and whole exome sequencing to obtain a comprehensive profile of genetic variants in ALS disease genes in 301 German pedigrees with familial ALS. We report C9orf72 mutations as well as variants in consensus splice sites and non-synonymous variants in protein-coding regions of ALS genes. We furthermore estimate their pathogenicity by taking into account type and frequency of the respective variant as well as segregation within the families. 49% of our German ALS families carried a likely pathogenic variant in at least one of the earlier identified ALS genes. In 45% of the ALS families, likely pathogenic variants were detected in C9orf72, SOD1, FUS, TARDBP or TBK1, whereas the relative contribution of the other ALS genes in this familial ALS cohort was 4%. We identified several previously unreported rare variants and demonstrated the absence of likely pathogenic variants in some of the recently described ALS disease genes. We here present a comprehensive genetic characterisation of German familial ALS. The present findings are of importance for genetic counselling in clinical practice, for molecular research and for the design of diagnostic gene panels or genotype-specific therapeutic interventions in Europe. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  16. Analyses of Brucella Pathogenesis, Host Immunity, and Vaccine Targets using Systems Biology and Bioinformatics

    OpenAIRE

    He, Yongqun

    2012-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omi...

  17. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  18. A COMPREHENSIVE ANALYSIS OF ISOLATED INFRANUCLEAR ABDUCENS NERVE PALSY IN A TERTIARY EYE CARE CENTRE

    Directory of Open Access Journals (Sweden)

    G. Dhamodara

    2017-01-01

    Full Text Available BACKGROUND: A comprehensive analysis of the aetiology and clinical profile of isolated infranuclear abducens nerve palsy in a tertiary eye care centre. MATERIALS AND METHODS: A hospital-based retrospective case series analysis of 90 isolated infranuclear neurogenic abducens nerve palsies. Documentation included age, gender, presenting complaints, history of diabetes mellitus and hypertension, mode of onset, progression of the disease, treatment given and recovery rate. A detailed ophthalmic evaluation of both eyes was performed, including anterior segment examination, extraocular movements, diplopia charting and Hess charting. A thorough central nervous system examination and systemic examination were done. Inclusion criteria: all isolated infranuclear neurogenic lesions of abducens nerve palsy. Exclusion criteria: conditions such as supranuclear lesions, myasthenia, orbital inflammation and myopathies, and false localising sign of abducens nerve palsy were excluded by appropriate testing and investigations. RESULTS: A total of 90 patients were included. Mean age of presentation was between the 3rd and 5th decades, with male preponderance. The commonest presenting symptom was diplopia (71.1%); the commonest causes were idiopathic neuritis (48%), diabetes mellitus (20%), hypertension (15%), trauma (10%) and others (7%). CONCLUSION: In our study, isolated infranuclear abducens nerve palsy of nonspecific aetiology, predominantly affecting males in the 3rd to 5th decades, was seen with variable recovery rates. Hence, careful clinical examination is essential in all cases, with close long-term follow-up.

  19. A comprehensive segmentation analysis of crude oil market based on time irreversibility

    Science.gov (United States)

    Xia, Jianan; Shang, Pengjian; Lu, Dan; Yin, Yi

    2016-05-01

    In this paper, we perform a comprehensive entropic segmentation analysis of crude oil futures prices from 1983 to 2014, using the Jensen-Shannon divergence as the statistical distance between segments, and analyze the results from the original series S and from the series beginning in 1986 (marked as S∗) to find common segments that share the same boundaries. We then apply time irreversibility analysis to each segment to divide the segments into two groups according to their degree of asymmetry. Based on the temporal distribution of the common segments and high-asymmetry segments, we find that these two types of segments appear alternately and essentially do not overlap in the daily group, while the common segments are also high-asymmetry segments in the weekly group. In addition, the temporal distribution of the common segments is fairly close to the timing of crises, wars or other events, because the impact of severe events on the oil price makes these common segments quite different from their adjacent segments. The common segments can be confirmed in the daily or weekly group series owing to the large divergence between common segments and their neighbors, while the identification of high-asymmetry segments helps to reveal the segments that are not badly affected by the events and can recover to steady states automatically. Finally, we rearrange the segments by merging connected common segments or high-asymmetry segments into a single segment, and by joining connected segments that are neither common nor highly asymmetric.
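The Jensen-Shannon divergence used above as the distance between segments has a compact definition: the entropy of the weighted mixture minus the weighted entropies of the two distributions. A minimal Python sketch follows. Representing each segment as a discrete probability distribution (for example a histogram of symbolized returns) and using equal default weights are assumptions; in entropic segmentation the weights are usually taken proportional to the segment lengths.

```python
from math import log2


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list."""
    return -sum(x * log2(x) for x in p if x > 0)


def jensen_shannon_divergence(p, q, w_p=0.5, w_q=0.5):
    """JSD(p, q) = H(w_p*p + w_q*q) - w_p*H(p) - w_q*H(q), with w_p + w_q = 1."""
    mixture = [w_p * a + w_q * b for a, b in zip(p, q)]
    return shannon_entropy(mixture) - w_p * shannon_entropy(p) - w_q * shannon_entropy(q)
```

A segmentation procedure of this kind would typically evaluate the divergence at every candidate cut point and split where it is largest, provided the value is statistically significant.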

  20. Comprehensive techno-economic analysis of wastewater-based algal biofuel production: A case study.

    Science.gov (United States)

    Xin, Chunhua; Addy, Min M; Zhao, Jinyu; Cheng, Yanling; Cheng, Sibo; Mu, Dongyan; Liu, Yuhuan; Ding, Rijia; Chen, Paul; Ruan, Roger

    2016-07-01

    Combining algae cultivation and wastewater treatment for biofuel production is considered a feasible way for resource utilization. An updated comprehensive techno-economic analysis method that integrates resources availability into techno-economic analysis was employed to evaluate the wastewater-based algal biofuel production with the consideration of wastewater treatment improvement, greenhouse gases emissions, biofuel production costs, and coproduct utilization. An innovative approach consisting of microalgae cultivation on centrate wastewater, microalgae harvest through flocculation, solar drying of biomass, pyrolysis of biomass to bio-oil, and utilization of co-products, was analyzed and shown to yield profound positive results in comparison with others. The estimated break-even selling price of biofuel ($2.23/gallon) is very close to the acceptable level. The approach would have better overall benefits and the internal rate of return would increase up to 18.7% if three critical components, namely cultivation, harvest, and downstream conversion could achieve breakthroughs. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. A Comprehensive Analysis on Wearable Acceleration Sensors in Human Activity Recognition

    Directory of Open Access Journals (Sweden)

    Majid Janidarmian

    2017-03-01

    Full Text Available Sensor-based motion recognition integrates the emerging area of wearable sensors with novel machine learning techniques to make sense of low-level sensor data and provide rich contextual information in a real-life application. Although Human Activity Recognition (HAR) problem has been drawing the attention of researchers, it is still a subject of much debate due to the diverse nature of human activities and their tracking methods. Finding the best predictive model in this problem while considering different sources of heterogeneities can be very difficult to analyze theoretically, which stresses the need of an experimental study. Therefore, in this paper, we first create the most complete dataset, focusing on accelerometer sensors, with various sources of heterogeneities. We then conduct an extensive analysis on feature representations and classification techniques (the most comprehensive comparison yet with 293 classifiers) for activity recognition. Principal component analysis is applied to reduce the feature vector dimension while keeping essential information. The average classification accuracy of eight sensor positions is reported to be 96.44% ± 1.62% with 10-fold evaluation, whereas accuracy of 79.92% ± 9.68% is reached in the subject-independent evaluation. This study presents significant evidence that we can build predictive models for HAR problem under more realistic conditions, and still achieve highly accurate results.

  2. A Comprehensive Analysis on Wearable Acceleration Sensors in Human Activity Recognition.

    Science.gov (United States)

    Janidarmian, Majid; Roshan Fekr, Atena; Radecka, Katarzyna; Zilic, Zeljko

    2017-03-07

    Sensor-based motion recognition integrates the emerging area of wearable sensors with novel machine learning techniques to make sense of low-level sensor data and provide rich contextual information in a real-life application. Although Human Activity Recognition (HAR) problem has been drawing the attention of researchers, it is still a subject of much debate due to the diverse nature of human activities and their tracking methods. Finding the best predictive model in this problem while considering different sources of heterogeneities can be very difficult to analyze theoretically, which stresses the need of an experimental study. Therefore, in this paper, we first create the most complete dataset, focusing on accelerometer sensors, with various sources of heterogeneities. We then conduct an extensive analysis on feature representations and classification techniques (the most comprehensive comparison yet with 293 classifiers) for activity recognition. Principal component analysis is applied to reduce the feature vector dimension while keeping essential information. The average classification accuracy of eight sensor positions is reported to be 96.44% ± 1.62% with 10-fold evaluation, whereas accuracy of 79.92% ± 9.68% is reached in the subject-independent evaluation. This study presents significant evidence that we can build predictive models for HAR problem under more realistic conditions, and still achieve highly accurate results.
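The gap between the 10-fold and subject-independent accuracies reported above comes from how the data are split: in a subject-independent evaluation, no subject contributes samples to both the training and the test set. One common way to build such splits is leave-one-subject-out, sketched below in Python. The abstract does not state which subject-independent protocol the authors used, so the scheme and the function name are assumptions.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids: one subject label per sample, in sample order.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

A plain 10-fold split, by contrast, shuffles samples without regard to subject, so windows from the same person can land on both sides of the split and inflate the measured accuracy.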

  3. Comparative analysis of peak-detection techniques for comprehensive two-dimensional chromatography.

    Science.gov (United States)

    Latha, Indu; Reichenbach, Stephen E; Tao, Qingping

    2011-09-23

    Comprehensive two-dimensional gas chromatography (GC×GC) is a powerful technology for separating complex samples. The typical goal of GC×GC peak detection is to aggregate data points of analyte peaks based on their retention times and intensities. Two techniques commonly used for two-dimensional peak detection are the two-step algorithm and the watershed algorithm. A recent study [4] compared the performance of the two-step and watershed algorithms for GC×GC data with retention-time shifts in the second-column separations. In that analysis, the peak retention-time shifts were corrected while applying the two-step algorithm but the watershed algorithm was applied without shift correction. The results indicated that the watershed algorithm has a higher probability of erroneously splitting a single two-dimensional peak than the two-step approach. This paper reconsiders the analysis by comparing peak-detection performance for resolved peaks after correcting retention-time shifts for both the two-step and watershed algorithms. Simulations with wide-ranging conditions indicate that when shift correction is employed with both algorithms, the watershed algorithm detects resolved peaks with greater accuracy than the two-step method. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. IRSN global process for leading a comprehensive fire safety analysis for nuclear installations

    International Nuclear Information System (INIS)

    Ormieres, Yannick; Lacoue, Jocelyne

    2013-01-01

    A fire safety analysis (FSA) is requested to justify the adequacy of fire protection measures set by the operator. A recent document written by IRSN outlines a global process for such a comprehensive fire safety analysis. Thanks to the French nuclear fire safety regulation evolutions, from prescriptive requirements to objective requirements, the proposed fire safety justification process focuses on compliance with performance criteria for fire protection measures. These performance criteria are related to the vulnerability of targets to effects of fire, and not only based upon radiological consequences outside the installation caused by a fire. In his FSA, the operator has to define the safety functions that should continue to fulfil their mission even in the case of fire in order to be in compliance with nuclear safety objectives. Then, in order to maintain these safety functions, the operator has to justify the adequacy of fire protection measures, defined according to defence in depth principles. To reach the objective, the analysis process is based on the identification of targets to be protected in order to maintain safety functions, taking into account facility characteristics. These targets include structures, systems, components and personnel important to safety. Facility characteristics include, for all operating conditions, potential ignition sources and fire protection systems. One of the key points of the fire analysis is the assessment of possible fire scenarios in the facility. Given the large number of possible fire scenarios, it is then necessary to evaluate 'reference fires' which are the worst case scenarios of all possible fire scenarios and which are used by the operator for the design of fire protection measures. (authors)

  5. A comprehensive association analysis of homocysteine metabolic pathway genes in Singaporean Chinese with ischemic stroke.

    Directory of Open Access Journals (Sweden)

    Hui-Qi Low

    Full Text Available BACKGROUND: The effect of genetic factors, apart from 5,10-methylenetetrahydrofolate reductase (MTHFR) polymorphisms, on elevated plasma homocysteine levels and increasing ischemic stroke risk has not been fully elucidated. We conducted a comprehensive analysis of 25 genes involved in homocysteine metabolism to investigate association of common variants within these genes with ischemic stroke risk. METHODOLOGY/PRINCIPAL FINDINGS: The study was done in two stages. In the initial study, SNP and haplotype-based association analyses were performed using 147 tagging Single Nucleotide Polymorphisms (SNPs) in 360 stroke patients and 354 non-stroke controls of Singaporean Chinese ethnicity. Joint association analysis of significant SNPs was then performed to assess the cumulative effect of these variants on ischemic stroke risk. In the replication study, 8 SNPs were selected for validation in an independent set of 420 matched case-control pairs of Singaporean Chinese ethnicity. SNP analysis from the initial study suggested 3 risk variants in the MTRR, SHMT1 and TCN2 genes which were moderately associated with ischemic stroke risk, independent of known stroke risk factors. Although the replication study failed to support single-SNP associations observed in the initial study, joint association analysis of the 3 variants in combined initial and replication samples revealed a trend of elevated risk with an increased number of risk alleles (joint P(trend) = 1.2×10^-6). CONCLUSIONS: Our study did not find direct evidence of associations between any single polymorphisms of homocysteine metabolic pathway genes and ischemic stroke, but suggests that the cumulative effect of several small to moderate risk variants from genes involved in homocysteine metabolism may jointly confer a significant impact on ischemic stroke risk.
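
A trend of risk across an increasing number of risk alleles is commonly assessed with a Cochran-Armitage trend test. The abstract does not say which test produced the reported joint P value, so the sketch below is an assumption used only to illustrate the idea; the function name `cochran_armitage_trend` is hypothetical and any counts passed to it here are made up, not taken from the study.

```python
import math


def cochran_armitage_trend(cases, totals, scores=None):
    """Return (z, two-sided p) for a linear trend in case proportion.

    cases[i]  : number of cases carrying i risk alleles
    totals[i] : cases plus controls carrying i risk alleles
    scores    : ordinal score per group (defaults to 0, 1, 2, ...)
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # overall case proportion
    # Score-weighted deviation of observed cases from their expectation
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    s1 = sum(s * n for s, n in zip(scores, totals))
    s2 = sum(s * s * n for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

A positive z indicates that the proportion of cases rises with the allele count; with equal case proportions in every group the statistic is zero.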

  6. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzhen, China, September 20-22, 2017.

  7. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of unskilled researchers. A portal enabling these researchers to benefit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  8. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    Full Text Available Background: The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of unskilled researchers. A portal enabling these researchers to benefit from new technologies is still missing. Results: We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion: We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical

  9. MetaComp: comprehensive analysis software for comparative meta-omics including comparative metagenomics.

    Science.gov (United States)

    Zhai, Peng; Yang, Longshu; Guo, Xiao; Wang, Zhe; Guo, Jiangtao; Wang, Xiaoqi; Zhu, Huaiqiu

    2017-10-02

    During the past decade, the development of high-throughput nucleic acid sequencing and mass spectrometry analysis techniques has enabled the characterization of microbial communities through metagenomics, metatranscriptomics, metaproteomics and metabolomics data. To reveal the diversity of microbial communities and interactions between living conditions and microbes, it is necessary to introduce comparative analysis based upon integration of all four types of data mentioned above. Comparative meta-omics, especially comparative metagenomics, has been established as a routine process to highlight the significant differences in taxon composition and functional gene abundance among microbiota samples. Meanwhile, biologists are increasingly concerned about the correlations between meta-omics features and environmental factors, which may further decipher the adaptation strategy of a microbial community. We developed a graphical comprehensive analysis software named MetaComp comprising a series of statistical analysis approaches with visualized results for metagenomics and other meta-omics data comparison. This software is capable of reading files generated by a variety of upstream programs. After data loading, analyses such as multivariate statistics, hypothesis testing of two-sample, multi-sample as well as two-group sample and a novel function-regression analysis of environmental factors are offered. Here, regression analysis regards meta-omic features as independent variables and environmental factors as dependent variables. Moreover, MetaComp is capable of automatically choosing an appropriate two-group sample test based upon the traits of input abundance profiles. We further evaluate the performance of its choice, and exhibit applications for metagenomics, metaproteomics and metabolomics samples. MetaComp, an integrative software package applicable to all meta-omics data, originally distills the influence of the living environment on the microbial community by regression analysis

  10. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections worldwide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bioinformatics is applied to assist in the rational design and improvement of vaccines, particularly flavivirus vaccines, are presented and discussed.

  11. Language and Cognitive Predictors of Text Comprehension: Evidence from Multivariate Analysis

    Science.gov (United States)

    Kim, Young-Suk

    2015-01-01

    Using data from children in South Korea (N = 145, M[subscript age] = 6.08), it was determined how low-level language and cognitive skills (vocabulary, syntactic knowledge, and working memory) and high-level cognitive skills (comprehension monitoring and theory of mind [ToM]) are related to listening comprehension and whether listening…

  12. A comprehensive evaluation of the sl1p pipeline for 16S rRNA gene sequencing analysis.

    Science.gov (United States)

    Whelan, Fiona J; Surette, Michael G

    2017-08-14

    Advances in next-generation sequencing technologies have allowed for detailed, molecular-based studies of microbial communities such as the human gut, soil, and ocean waters. Sequencing of the 16S rRNA gene, specific to prokaryotes, using universal PCR primers has become a common approach to studying the composition of these microbiota. However, the bioinformatic processing of the resulting millions of DNA sequences can be challenging, and a standardized protocol would aid in reproducible analyses. The short-read library 16S rRNA gene sequencing pipeline (sl1p, pronounced "slip") was designed with the purpose of mitigating this lack of reproducibility by combining pre-existing tools into a computational pipeline. This pipeline automates the processing of raw 16S rRNA gene sequencing data to create human-readable tables, graphs, and figures to make the collected data more readily accessible. Data generated from mock communities were compared using eight OTU clustering algorithms, two taxon assignment approaches, and three 16S rRNA gene reference databases. While all of these algorithms and options are available to sl1p users, through testing with human-associated mock communities, AbundantOTU+, the RDP Classifier, and the Greengenes 2011 reference database were chosen as sl1p's defaults based on their ability to best represent the known input communities. sl1p promotes reproducible research by providing a comprehensive log file, and reduces the computational knowledge needed by the user to process next-generation sequencing data. sl1p is freely available at https://bitbucket.org/fwhelan/sl1p .

  13. Comprehensive metabolic panel

    Science.gov (United States)

    Metabolic panel - comprehensive; Chem-20; SMA20; Sequential multi-channel analysis with computer-20; SMAC20; Metabolic panel 20 ... Chernecky CC, Berger BJ. Comprehensive metabolic panel (CMP) - blood. In: ... Tests and Diagnostic Procedures . 6th ed. St Louis, MO: ...

  14. Comprehensive Analysis of the Gas- and Particle-Phase Products of VOC Oxidation

    Science.gov (United States)

    Bakker-Arkema, J.; Ziemann, P. J.

    2017-12-01

    Controlled environmental chamber studies are important for determining atmospheric reaction mechanisms and gas and aerosol products formed in the oxidation of volatile organic compounds (VOCs). Such information is necessary for developing detailed chemical models for use in predicting the atmospheric fate of VOCs and also secondary organic aerosol (SOA) formation. However, complete characterization of atmospheric oxidation reactions, including gas- and particle-phase product yields and reaction branching ratios, is difficult to achieve. In this work, we investigated the reactions of terminal and internal alkenes with OH radicals in the presence of NOx in an attempt to fully characterize the chemistry of these systems while minimizing and accounting for the inherent uncertainties associated with environmental chamber experiments. Gas-phase products (aldehydes formed by alkoxy radical decomposition) and particle-phase products (alkyl nitrates, β-hydroxynitrates, dihydroxynitrates, 1,4-hydroxynitrates, 1,4-hydroxycarbonyls, and dihydroxycarbonyls) formed through pathways involving addition of OH to the C=C double bond as well as H-atom abstraction were identified and quantified using a suite of analytical techniques. Particle-phase products were analyzed in real time with a thermal desorption particle beam mass spectrometer, and off-line by collection onto filters, extraction, and subsequent analysis of functional groups by derivatization-spectrophotometric methods developed in our lab. Derivatized products were also separated by liquid chromatography for molecular quantitation by UV absorbance and identification using chemical ionization-ion trap mass spectrometry. Gas-phase aldehydes were analyzed off-line by collection onto Tenax and a 5-channel denuder with subsequent analysis by gas chromatography, or by collection onto DNPH-coated cartridges and subsequent analysis by liquid chromatography. The full product identification and quantitation, with careful

  15. Exergetic performance analysis of an ice-cream manufacturing plant: A comprehensive survey

    International Nuclear Information System (INIS)

    Dowlati, Majid; Aghbashlo, Mortaza; Mojarab Soufiyan, Mohamad

    2017-01-01

    In this study, a comprehensive exergetic performance analysis of an ice-cream manufacturing plant was conducted in order to pinpoint the locations of thermodynamic inefficiencies. Exergetic performance parameters of each subunit of the plant were determined and illustrated individually by writing and solving energy and exergy balance equations on the basis of real operational data. The required data were acquired from a local ice-cream factory located in Tehran, Iran. The plant comprised three main subsystems: the water steam generator, the refrigeration system, and the ice-cream production line. An attempt was also made to quantify the specific exergy destruction of the ice-cream manufacturing process. The functional exergetic efficiency of the water steam generator, refrigeration system, and ice-cream production line was determined at 17.45%, 25.52%, and 5.71%, respectively. The overall functional exergetic efficiency of the process was found to be 2.15%, while the specific exergy destruction was calculated as 719.80 kJ/kg. In general, exergy analysis and its derivatives could provide invaluable information beyond conventional energy analysis, suggesting potential locations for the plant performance improvement. - Highlights: • An ice-cream manufacturing plant was exergetically analyzed using the actual data. • Water steaming unit had the highest irreversibility rate among the plant subunits. • The specific exergy destruction of the ice-cream manufacturing was 719.80 kJ/kg. • The overall exergetic efficiency of the process was found to be 2.15%.

  16. Comprehensive analysis of the soybean (Glycine max) GmLAX auxin transporter gene family

    Directory of Open Access Journals (Sweden)

    Chenglin Chai

    2016-03-01

    Full Text Available The phytohormone auxin plays a critical role in regulation of plant growth and development as well as plant responses to abiotic stresses. This is mainly achieved through its uneven distribution in plants via a polar auxin transport process. Auxin transporters are major players in polar auxin transport. The AUXIN RESISTANT 1/LIKE AUX1 (AUX/LAX) auxin influx carriers belong to the amino acid permease family of proton-driven transporters and function in the uptake of indole-3-acetic acid (IAA). In this study, a genome-wide comprehensive analysis of the soybean AUX/LAX (GmLAX) gene family, including phylogenic relationships, chromosome localization, and gene structure, was carried out. A total of 15 GmLAX genes, including seven duplicated gene pairs, were identified in the soybean genome. They were distributed on 10 chromosomes. Despite their higher percentage identities at the protein level, GmLAXs exhibited versatile tissue-specific expression patterns, indicating coordinated functioning during plant growth and development. Most GmLAXs were responsive to drought and dehydration stresses and auxin and abscisic acid (ABA) stimuli, in a tissue- and/or time point-sensitive mode. Several GmLAX members were involved in responding to salt stress. Sequence analysis revealed that promoters of GmLAXs contained different combinations of stress-related cis-regulatory elements. These studies suggest that the soybean GmLAXs were under control of a very complex regulatory network, responding to various internal and external signals. This study helps to identify candidate GmLAXs for further analysis of their roles in soybean development and adaptation to adverse environments.

  17. The Effects of Visual Attention Span and Phonological Decoding in Reading Comprehension in Dyslexia: A Path Analysis.

    Science.gov (United States)

    Chen, Chen; Schneps, Matthew H; Masyn, Katherine E; Thomson, Jennifer M

    2016-11-01

    Increasing evidence has shown visual attention span to be a factor, distinct from phonological skills, that explains single-word identification (pseudo-word/word reading) performance in dyslexia. Yet, little is known about how well visual attention span explains text comprehension. Observing reading comprehension in a sample of 105 high school students with dyslexia, we used a pathway analysis to examine the direct and indirect path between visual attention span and reading comprehension while controlling for other factors such as phonological awareness, letter identification, short-term memory, IQ and age. Integrating phonemic decoding efficiency skills in the analytic model, this study aimed to disentangle how visual attention span and phonological skills work together in reading comprehension for readers with dyslexia. We found visual attention span to have a significant direct effect on more difficult reading comprehension but not on an easier level. It also had a significant direct effect on pseudo-word identification but not on word identification. In addition, we found that visual attention span indirectly explains reading comprehension through pseudo-word reading and word reading skills. This study supports the hypothesis that at least part of the dyslexic profile can be explained by visual attention abilities. Copyright © 2016 John Wiley & Sons, Ltd.

  18. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...
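
Penalized feature selection can be illustrated with the lasso, where an L1 penalty drives the coefficients of uninformative features to exactly zero. The sketch below is a simplified stand-in and not a method taken from the review: it fits a lasso for a continuous response by coordinate descent rather than a penalized classifier, and the function names `soft_threshold` and `lasso_coordinate_descent` as well as the toy data in the usage note are hypothetical.

```python
def soft_threshold(a, lam):
    """Shrink a toward zero by lam; values within [-lam, lam] become 0."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses; returns the coefficient list.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta
```

For example, with rows `[[1, 1], [-1, 1], [1, -1], [-1, -1]]`, responses `[2, -2, 2, -2]` and `lam=0.5`, only the first feature is related to the response; the fit keeps it (shrunk from 2 toward zero) and drops the second feature entirely.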

  19. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  20. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.