WorldWideScience

Sample records for bioinformatic cis-element analyses

  1. Metabolome, transcriptome, and bioinformatic cis-element analyses point to HNF-4 as a central regulator of gene expression during enterocyte differentiation

    DEFF Research Database (Denmark)

    Stegmann, Anders; Hansen, Morten; Wang, Yulan;

    2006-01-01

    DNA-binding transcription factors bind to promoters that carry their binding sites. Transcription factors therefore function as nodes in gene regulatory networks. In the present work we used a bioinformatic approach to search for transcription factors that might function as nodes in gene regulatory...

  2. Metabolome, transcriptome, and bioinformatic cis-element analyses point to HNF-4 as a central regulator of gene expression during enterocyte differentiation

    DEFF Research Database (Denmark)

    Stegmann, Anders; Hansen, Morten; Wang, Yulan

    2006-01-01

    networks during the differentiation of the small intestinal epithelial cell. In addition we have searched for connections between transcription factors and the villus metabolome. Transcriptome data were generated from mouse small intestinal villus, crypt, and fetal intestinal epithelial cells. Metabolome...... data were generated from crypt and villus cells. Our results show that genes that are upregulated during fetal to adult and crypt to villus differentiation have an overrepresentation of potential hepatocyte nuclear factor (HNF)-4 binding sites in their promoters. Moreover, metabolome analyses by magic...... that are involved in lipid metabolism. Our approach also identifies transcription factors of importance for crypt functions such as DNA replication (E2F) and stem cell maintenance (c-Myc)....

  3. Bioinformatic cis-element analyses performed in Arabidopsis and rice disclose bZIP- and MYB-related binding sites as potential AuxRE-coupling elements in auxin-mediated transcription

    Directory of Open Access Journals (Sweden)

    Berendzen Kenneth W

    2012-08-01

    Full Text Available Abstract Background In higher plants, a diverse array of developmental and growth-related processes is regulated by the plant hormone auxin. Recent publications have proposed that besides the well-characterized Auxin Response Factors (ARFs that bind Auxin Response Elements (AuxREs, also members of the bZIP- and MYB-transcription factor (TF families participate in transcriptional control of auxin-regulated genes via bZIP Response Elements (ZREs or Myb Response Elements (MREs, respectively. Results Applying a novel bioinformatic algorithm, we demonstrate on a genome-wide scale that singular motifs or composite modules of AuxREs, ZREs, MREs but also of MYC2 related elements are significantly enriched in promoters of auxin-inducible genes. Despite considerable, species-specific differences in the genome structure in terms of the GC content, this enrichment is generally conserved in dicot (Arabidopsis thaliana and monocot (Oryza sativa model plants. Moreover, an enrichment of defined composite modules has been observed in selected auxin-related gene families. Consistently, a bipartite module, which encompasses a bZIP-associated G-box Related Element (GRE and an AuxRE motif, has been found to be highly enriched. Making use of transient reporter studies in protoplasts, these findings were experimentally confirmed, demonstrating that GREs functionally interact with AuxREs in regulating auxin-mediated transcription. Conclusions Using genome-wide bioinformatic analyses, evolutionary conserved motifs have been defined which potentially function as AuxRE-dependent coupling elements to establish auxin-specific expression patterns. Based on these findings, experimental approaches can be designed to broaden our understanding of combinatorial, auxin-controlled gene regulation.

  4. The twilight zone of cis element alignments.

    Science.gov (United States)

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2013-02-01

    Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.

  5. Bioinformatics tools for analysing viral genomic data.

    Science.gov (United States)

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  6. Bioinformatics analyses for signal transduction networks

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Research in signaling networks contributes to a deeper understanding of organism living activities. With the development of experimental methods in the signal transduction field, more and more mechanisms of signaling pathways have been discovered. This paper introduces such popular bioin-formatics analysis methods for signaling networks as the common mechanism of signaling pathways and database resource on the Internet, summerizes the methods of analyzing the structural properties of networks, including structural Motif finding and automated pathways generation, and discusses the modeling and simulation of signaling networks in detail, as well as the research situation and tendency in this area. Now the investigation of signal transduction is developing from small-scale experiments to large-scale network analysis, and dynamic simulation of networks is closer to the real system. With the investigation going deeper than ever, the bioinformatics analysis of signal transduction would have immense space for development and application.

  7. Whale song analyses using bioinformatics sequence analysis approaches

    Science.gov (United States)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools DNA/protein sequence alignment and alignment-free methods are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, masters thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure; the standardized Euclidian distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree with earlier classifications derived by human observation qualitatively. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.

  8. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...... as a strategic frontier between biology and computer science. Machine learning approaches (e.g. neural networks, hidden Markov models, and belief networsk) are ideally suited for areas in which there is a lot of data but little theory. The goal in machine learning is to extract useful information from a body...... of data by building good probabilistic models. The particular twist behind machine learning, however, is to automate the process as much as possible.In this book, the authors present the key machine learning approaches and apply them to the computational problems encountered in the analysis of biological...

  9. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through bioinformatics approach. Among 107 Shigella, 434 CRISPR structure loci were identified with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. The cas gene deletions were discovered in 88 strains. However, there is one strain that does not contain cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Of them, Type I spacer referred to those linked with one gene segment, Type II spacer linked with two or more different gene segments, and Type III spacer undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.

  10. Optimizing selection of microsatellite loci from 454 pyrosequencing via post-sequencing bioinformatic analyses.

    Science.gov (United States)

    Fernandez-Silva, Iria; Toonen, Robert J

    2013-01-01

    The comparatively low cost of massive parallel sequencing technology, also known as next-generation sequencing (NGS), has transformed the isolation of microsatellite loci. The most common NGS approach consists of obtaining large amounts of sequence data from genomic DNA or enriched microsatellite libraries, which is then mined for the discovery of microsatellite repeats using bioinformatics analyses. Here, we describe a bioinformatics approach to isolate microsatellite loci, starting from the raw sequence data through a subset of microsatellite primer pairs. The primary difference to previously published approaches includes analyses to select the most accurate sequence data and to eliminate repetitive elements prior to the design of primers. These analyses aim to minimize the testing of primer pairs by identifying the most promising microsatellite loci.

  11. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach.

  12. Identification of a Positive Cis-Element Upstream of Human NKX3.1 Gene

    Institute of Scientific and Technical Information of China (English)

    An-Li JIANG; Peng-Ju ZHANG; Xiao-Yan HU; Wei-Wen CHEN; Feng KONG; Zhi-Fang LIU; Hui-Qing YUAN; Jian-Ye ZHANG

    2005-01-01

    NKX3.1 is a prostate-specific homeobox gene related to prostate development and prostate cancer. In this work, we aimed to identify precisely the functional cis-element in the 197 bp region (from -1032 to -836 bp) of the NKX3.1 promoter (from -1032 to +8 bp), which was previously identified to present positive regulatory activity on NKX3.1 expression, by deletion mutagenesis analysis and electrophoretic mobility shift assay (EMSA). A 16 bp positive cis-element located between -920 and -905 bp upstream of the NKX3.1 gene was identified by deletion mutation analysis and proved to be a functional positive cis-element by EMSA. It will be important to further study the functions and regulatory mechanisms of this positive cis-element in NKX3.1 gene expression.

  13. Mechanical stress induces biotic and abiotic stress responses via a novel cis-element.

    Directory of Open Access Journals (Sweden)

    Justin W Walley

    2007-10-01

    Full Text Available Plants are continuously exposed to a myriad of abiotic and biotic stresses. However, the molecular mechanisms by which these stress signals are perceived and transduced are poorly understood. To begin to identify primary stress signal transduction components, we have focused on genes that respond rapidly (within 5 min to stress signals. Because it has been hypothesized that detection of physical stress is a mechanism common to mounting a response against a broad range of environmental stresses, we have utilized mechanical wounding as the stress stimulus and performed whole genome microarray analysis of Arabidopsis thaliana leaf tissue. This led to the identification of a number of rapid wound responsive (RWR genes. Comparison of RWR genes with published abiotic and biotic stress microarray datasets demonstrates a large overlap across a wide range of environmental stresses. Interestingly, RWR genes also exhibit a striking level and pattern of circadian regulation, with induced and repressed genes displaying antiphasic rhythms. Using bioinformatic analysis, we identified a novel motif overrepresented in the promoters of RWR genes, herein designated as the Rapid Stress Response Element (RSRE. We demonstrate in transgenic plants that multimerized RSREs are sufficient to confer a rapid response to both biotic and abiotic stresses in vivo, thereby establishing the functional involvement of this motif in primary transcriptional stress responses. Collectively, our data provide evidence for a novel cis-element that is distributed across the promoters of an array of diverse stress-responsive genes, poised to respond immediately and coordinately to stress signals. This structure suggests that plants may have a transcriptional network resembling the general stress signaling pathway in yeast and that the RSRE element may provide the key to this coordinate regulation.

  14. Managing, analysing, and integrating big data in medical bioinformatics: open problems and future perspectives.

    Science.gov (United States)

    Merelli, Ivan; Pérez-Sánchez, Horacio; Gesing, Sandra; D'Agostino, Daniele

    2014-01-01

    The explosion of the data both in the biomedical research and in the healthcare systems demands urgent solutions. In particular, the research in omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare is additionally always asking for a tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to better understanding of diseases and development of better and personalized diagnostics and therapeutics. However, such progresses are directly related to the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, for its annotation and integration and finally for inferring knowledge and making it available to researchers. Bioinformatics can be viewed as the "glue" for all these processes. A clear awareness of present high performance computing (HPC) solutions in bioinformatics, Big Data analysis paradigms for computational biology, and the issues that are still open in the biomedical and healthcare fields represent the starting point to win this challenge.

  15. Managing, Analysing, and Integrating Big Data in Medical Bioinformatics: Open Problems and Future Perspectives

    Directory of Open Access Journals (Sweden)

    Ivan Merelli

    2014-01-01

    Full Text Available The explosion of the data both in the biomedical research and in the healthcare systems demands urgent solutions. In particular, the research in omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare is additionally always asking for a tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to better understanding of diseases and development of better and personalized diagnostics and therapeutics. However, such progresses are directly related to the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, for its annotation and integration and finally for inferring knowledge and making it available to researchers. Bioinformatics can be viewed as the “glue” for all these processes. A clear awareness of present high performance computing (HPC solutions in bioinformatics, Big Data analysis paradigms for computational biology, and the issues that are still open in the biomedical and healthcare fields represent the starting point to win this challenge.

  16. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics.

    Science.gov (United States)

    He, Yongqun

    2012-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  17. Bioinformatics analyses provide insight into distant homology of the Keap1-Nrf2 pathway.

    Science.gov (United States)

    Gacesa, Ranko; Dunlap, Walter C; Long, Paul F

    2015-11-01

    An essential requirement for the evolution of early eukaryotic life was the development of effective means to protect against metabolic oxidative stress and exposure to environmental toxicants. In present-day mammals, the master transcription factor Nrf2 regulates basal level homeostasis and inducible expression of numerous detoxifying and antioxidant genes. To examine early evolution of the Keap1-Nrf2 pathway, we present bioinformatics analyses of distant homology of mammalian Keap1 and Nrf2 proteins across the Kingdoms of Life. Software written for this analysis is made freely available on-line. Furthermore, utilizing protein modeling and virtual screening methods, we demonstrate potential for Nrf2 activation by competitive inhibition of its binding to Keap1, specifically by UV-protective fungal mycosporines and marine mycosporine-like amino acids (MAAs). We contend that coevolution of Nrf2-activating secondary metabolites by fungi and other extant microbiota may provide prospective compound leads for the design of new therapeutics to target activation of the human Keap1-Nrf2 pathway for treating degenerative diseases of ageing.

  18. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    Energy Technology Data Exchange (ETDEWEB)

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V. (NIH); (UTSMC)

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  19. Role of NSC319726 in ovarian cancer based on the bioinformatics analyses

    Science.gov (United States)

    Xue, Ji; Yang, Guang; Ding, Hong; Wang, Pu; Wang, Changhong

    2015-01-01

    Aim This study aimed to explore the molecular mechanisms of NSC319726 in ovarian cancer by bioinformatics analyses. Methods Gene expression profile GSE35972 was downloaded from the Gene Expression Omnibus database. The data set contains six samples, including three samples of TOV112D cells untreated and three samples of TOV112D cells treated with NSC319726. The differentially expressed genes (DEGs) between untreated and treated samples were analyzed using the limma package. Kyoto Encyclopedia of Genes and Genomes pathway analysis was performed using the signaling pathway impact analysis package, followed by the functional annotation of DEGs and protein–protein interaction network construction. Finally, the subnetwork was identified, and Gene Ontology functional enrichment analysis was performed on the DEGs enriched in the subnetwork. Results A total of 120 upregulated and 126 downregulated DEGs were identified. Six Kyoto Encyclopedia of Genes and Genomes pathways were significantly perturbed by DEGs, and the pathway of oocyte meiosis was identified as the most perturbed one. Oocyte meiosis was enriched by eight downregulated DEGs, such as ribosomal protein S6 kinase, 90 kDa, and polypeptide 6 (RPS6KA6). After functional annotation, eight transcription factors were upregulated (such as B-cell CLL/lymphoma 6 [BCL6]), and three transcription factors were downregulated. Seven tumor suppressor genes, such as forkhead box O3 (FOXO3), were upregulated. Additionally, in the protein–protein interaction network and subnetwork, cyclin B1 (CCNB1) and cell division cycle 20 (CDC20) were hub genes, which were also involved in the functions related to mitotic cell cycle. Conclusion NSC319726 may play an efficient role against ovarian cancer via targeting genes, such as RPS6KA6, BCL6, FOXO3, CCNB1, and CDC20, which are involved in oocyte meiosis pathway. PMID:26719703

  20. Solid Phase Biosensors for Arsenic or Cadmium Composed of A trans Factor and cis Element Complex

    Directory of Open Access Journals (Sweden)

    Mohammad Shohel Rana Siddiki

    2011-10-01

    Full Text Available The presence of toxic metals in drinking water has hazardous effects on human health. This study was conducted to develop GFP-based-metal-binding biosensors for on-site assay of toxic metal ions. GFP-tagged ArsR and CadC proteins bound to a cis element, and lost the capability of binding to it in their As- and Cd-binding conformational states, respectively. Water samples containing toxic metals were incubated on a complex of GFP-tagged ArsR or CadC and cis element which was immobilized on a solid surface. Metal concentrations were quantified with fluorescence intensity of the metal-binding states released from the cis element. Fluorescence intensity obtained with the assay significantly increased with increasing concentrations of toxic metals. Detection limits of 1 μg/L for Cd(II and 5 μg/L for As(III in purified water and 10 µg/L for Cd(II and As(III in tap water and bottled mineral water were achieved by measurement with a battery-powered portable fluorometer after 15-min and 30-min incubation, respectively. A complex of freeze dried GFP-tagged ArsR or CadC binding to cis element was stable at 4 °C and responded to 5 μg/L As(III or Cd(II. The solid phase biosensors are sensitive, less time-consuming, portable, and could offer a protocol for on-site evaluation of the toxic metals in drinking water.

  1. Role of NSC319726 in ovarian cancer based on the bioinformatics analyses

    Directory of Open Access Journals (Sweden)

    Xue J

    2015-12-01

    Full Text Available Ji Xue,1 Guang Yang,1 Hong Ding,1 Pu Wang,2 Changhong Wang3 1Department of Chinese Medicine, The Second Hospital of Jilin University, 2The Clinical Medical College of Jilin University (Grade 2013, 3Department of Chinese Medicine, China–Japan Union Hospital of Jilin University, Changchun City, People’s Republic of China Aim: This study aimed to explore the molecular mechanisms of NSC319726 in ovarian cancer by bioinformatics analyses.Methods: Gene expression profile GSE35972 was downloaded from the Gene Expression Omnibus database. The data set contains six samples, including three samples of TOV112D cells untreated and three samples of TOV112D cells treated with NSC319726. The differentially expressed genes (DEGs between untreated and treated samples were analyzed using the limma package. Kyoto Encyclopedia of Genes and Genomes pathway analysis was performed using the signaling pathway impact analysis package, followed by the functional annotation of DEGs and protein–protein interaction network construction. Finally, the subnetwork was identified, and Gene Ontology functional enrichment analysis was performed on the DEGs enriched in the subnetwork.Results: A total of 120 upregulated and 126 downregulated DEGs were identified. Six Kyoto Encyclopedia of Genes and Genomes pathways were significantly perturbed by DEGs, and the pathway of oocyte meiosis was identified as the most perturbed one. Oocyte meiosis was enriched by eight downregulated DEGs, such as ribosomal protein S6 kinase, 90 kDa, and polypeptide 6 (RPS6KA6. After functional annotation, eight transcription factors were upregulated (such as B-cell CLL/lymphoma 6 [BCL6], and three transcription factors were downregulated. Seven tumor suppressor genes, such as forkhead box O3 (FOXO3, were upregulated. Additionally, in the protein–protein interaction network and subnetwork, cyclin B1 (CCNB1 and cell division cycle 20 (CDC20 were hub genes, which were also involved in the functions

  2. Isolation, Characterization, and Bioinformatic Analyses of Lytic Salmonella Enteritidis Phages and Tests of Their Antibacterial Activity in Food.

    Science.gov (United States)

    Han, Han; Wei, Xiaoting; Wei, Yi; Zhang, Xiufeng; Li, Xuemin; Jiang, Jinzhong; Wang, Ran

    2017-02-01

    Salmonella Enteritidis remains a major threat for food safety. To take efforts to develop phage-based biocontrol for S. Enteritidis contamination in food, in this study, the phages against S. Enteritidis were isolated from sewage samples, characterized by host range assays, DNA restriction enzyme pattern analyses, and transmission electron microscope observations, and tested for antibacterial activity in food; some potent phages were further characterized by bioinformatic analyses. Results showed that based on the plaque quality and host range, seven lytic phages targeting S. Enteritidis were selected, considered as seven distinct phages through DNA physical maps, and classified as Myoviridae or Siphoviridae family by morphologic observations; the combined use of such seven strain phages as a "food additive" could succeed in controlling the artificial S. Enteritidis contamination in the different physical forms of food at a range of temperatures; by bioinformatic analyses, both selected phage BPS11Q3 and BPS15Q2 seemed to be newfound obligate lytic phage strains with no indications for any potentially harmful genes in their genomes. In conclusion, our results showed a potential of isolated phages as food additives for controlling S. Enteritidis contamination in some salmonellosis outbreak-associated food vehicles, and there could be minimized potential risk associated with using BPS11Q3 and BPS15Q2 in food.

  3. Phosphoproteomics and bioinformatics analyses of spinal cord proteins in rats with morphine tolerance.

    Directory of Open Access Journals (Sweden)

    Wen-Jinn Liaw

    Full Text Available INTRODUCTION: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. METHODS: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal cord level by giving them twice-daily injections of morphine (20 µg/10 µL for 4 days. We confirmed tolerance by measuring paw withdrawal latencies and maximal possible analgesic effect of morphine on day 5. We then carried out phosphoproteomic analysis to investigate the global phosphorylation of spinal proteins associated with morphine tolerance. Finally, pull-down assays were used to identify phosphorylated types and sites of 14-3-3 proteins, and bioinformatics was applied to predict biological networks impacted by the morphine-regulated proteins. RESULTS: Our proteomics data showed that repeated morphine treatment altered phosphorylation of 10 proteins in the spinal cord. Pull-down assays identified 2 serine/threonine phosphorylated sites in 14-3-3 proteins. Bioinformatics further revealed that morphine impacted on cytoskeletal reorganization, neuroplasticity, protein folding and modulation, signal transduction and biomolecular metabolism. CONCLUSIONS: Repeated morphine administration may affect multiple biological networks by altering protein phosphorylation. These data may provide insight into the mechanism that underlies the development of morphine tolerance.

  4. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    Science.gov (United States)

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most of biologists work on textual DNA sequences that are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequences analysis: ADNViewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real-time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  5. Bioinformatic analyses and conceptual synthesis of evidence linking ZNF804A to risk for schizophrenia and bipolar disorder.

    Science.gov (United States)

    Hess, Jonathan L; Quinn, Thomas P; Akbarian, Schahram; Glatt, Stephen J

    2015-01-01

    Advances in molecular genetics, fueled by the results of large-scale genome-wide association studies, meta-analyses, and mega-analyses, have provided the means of identifying genetic risk factors for human disease, thereby enriching our understanding of the functionality of the genome in the post-genomic era. In the past half-decade, research on neuropsychiatric disorders has reached an important milestone: the identification of susceptibility genes reliably associated with complex psychiatric disorders at genome-wide levels of significance. This age of discovery provides the groundwork for follow-up studies designed to elucidate the mechanism(s) by which genetic variants confer susceptibility to these disorders. The gene encoding zinc-finger protein 804 A (ZNF804A) is among these candidate genes, recently being found to be strongly associated with schizophrenia and bipolar disorder via one of its non-coding mutations, rs1344706. Neurobiological, molecular, and bioinformatic analyses have improved our understanding of ZNF804A in general and this variant in particular; however, more work is needed to establish the mechanism(s) by which ZNF804A variants impinge on the biological substrates of the two disorders. Here, we review literature recently published on ZNF804A, and analyze critical concepts related to the biology of ZNF804A and the role of rs1344706 in schizophrenia and bipolar disorder. We synthesize the results of new bioinformatic analyses of ZNF804A with key elements of the existing literature and knowledge base. Furthermore, we suggest some potentially fruitful short- and long-term research goals in the assessment of ZNF804A.

  6. cDNA sequence and protein bioinformatics analyses of MSTN in African catfish (Clarias gariepinus).

    Science.gov (United States)

    Kanjanaworakul, Poonmanee; Sawatdichaikul, Orathai; Poompuang, Supawadee

    2016-04-01

    Myostatin, also known as growth differentiation factor 8, has been identified as a potent negative regulator of skeletal muscle growth. The purpose of this study was to characterize and predict function of the myostatin gene of the African catfish (Cg-MSTN). Expression of Cg-MSTN was determined at three growth stages to establish the relationship between the levels of MSTN transcript and skeletal muscle growth. The partial cDNA sequence of Cg-MSTN was cloned by using published information from its congener walking catfish (Cm-MSTN). The Cg-MSTN was 1194 bp in length encoding a protein of 397 amino acids. The deduced MSTN sequence exhibited key functional sites similar to those of other members of the TGF-β superfamily, especially, the proteolytic processing site (RXXR motif) and nine conserved cysteines at the C-terminal. Expression of MSTN appeared to be correlated with muscle development and growth of African catfish. Protein bioinformatics revealed that the primary sequence of Cg-MSTN shared 98 % sequence identity with that of walking catfish Cm-MSTN with only two different residues, [Formula: see text]. and [Formula: see text]. The proposed model of Cg-MSTN revealed the key point mutation [Formula: see text] causing a 7.35 Å shorter distance between the N- and C-lobes and an approximately 11° narrow angle than those of Cm-MSTN. The substitution of a proline residue near the proteolytic processing site which altered the structure of myostatin may play a critical role in reducing proteolytic activity of this protein in African catfish.

  7. Bioinformatics analyses of significant prognostic risk markers for thyroid papillary carcinoma.

    Science.gov (United States)

    Min, Xiao-Shan; Huang, Peng; Liu, Xu; Dong, Chao; Jiang, Xiao-Lin; Yuan, Zheng-Tai; Mao, Lin-Feng; Chang, Shi

    2015-09-01

    This study was aimed to identify the prognostic risk markers for thyroid papillary carcinoma (TPC) by bioinformatics. The clinical data of TPC and their microRNAs (miRNAs) and genes expression profile data were downloaded from The Cancer Genome Atlas. Elastic net-Cox's proportional regression hazards model (EN-COX) was used to identify the prognostic associated factors. The receiver operating characteristic (ROC) curve and Kaplan-Meier (KM) curve were used to screen the significant prognostic risk miRNA and genes. Then, the target genes of the obtained miRNAs were predicted followed by function prediction. Finally, the significant risk genes were performed literature mining and function analysis. Total 1046 miRNAs and 20531 genes in 484 cases samples were identified after data preprocessing. From the EN-COX model, 30 prognostic risk factors were obtained. Based on the 30 risk factors, 3 miRNAs and 11 genes were identified from the ROC and KM curves. The target genes of miRNA-342 such as B-cell CLL/lymphoma 2 (BCL2) were mainly enriched in the biological process related to cellular metabolic process and Disease Ontology terms of lymphoma. The target genes of miRNA-93 were mainly enriched in the pathway of G1 phase. Among the 11 prognostic risk genes, v-maf avian musculoaponeurotic fibrosarcoma oncogene homologue F (MAFF), SRY (sex-determining region Y)-box 4 (SOX4), and retinoic acid receptor, alpha (RARA) encoded transcription factors. Besides, RARA was enriched in four pathways. These prognostic markers such as miRNA-93, miRNA-342, RARA, MAFF, SOX4, and BCL2 may be used as targets for TPC chemoprevention.

  8. Cis-element of the rice PDIL2-3 promoter is responsible for inducing the endoplasmic reticulum stress response.

    Science.gov (United States)

    Takahashi, Hideyuki; Wang, Shuyi; Hayashi, Shimpei; Wakasa, Yuhya; Takaiwa, Fumio

    2014-05-01

    A protein disulfide isomerase (PDI) family oxidoreductase, PDIL2-3, is involved in endoplasmic reticulum (ER) stress responses in rice. We identified a critical cis-element required for induction of the ER stress response. The activation of PDIL2-3 in response to ER stress strongly depends on the IRE1-OsbZIP50 signaling pathway.

  9. Bioinformatics and expression analyses of the Ixodes scapularis tick cystatin family.

    Science.gov (United States)

    Ibelli, Adriana Mércia Guaratini; Hermance, Meghan M; Kim, Tae Kwon; Gonzalez, Cassandra Lee; Mulenga, Albert

    2013-05-01

    The cystatins are inhibitors of papain- and legumain-like cysteine proteinases, classified in MEROPS subfamilies I25A-I25C. This study shows that 84 % (42/50) of tick cystatins are putatively extracellular in subfamily I25B and the rest are putatively intracellular in subfamily I25A. On the neighbor joining phylogeny guide tree, subfamily I25A members cluster together, while subfamily I25B cystatins segregate among prostriata or metastriata ticks. Two Ixodes scapularis cystatins, AAY66864 and ISCW011771 that show 50-71 % amino acid identity to metastriata tick cystatins may be linked to pathways that are common to all ticks, while ISCW000447 100 % conserved in I. ricinus is important among prostriata ticks. Likewise metastriata tick cystatins, Dermacentor variabilis-ACF35512, Rhipicephalus microplus-ACX53850, A. americanum-AEO36092, R. sanguineus-ACX53922, D. variabilis-ACF35514, R. sanguineus-ACX54033 and A. maculatum-AEO35155 that show 73-86 % amino acid identity may be essential to metastriata tick physiology. RT-PCR expression analyses revealed that I. scapularis cystatins were constitutively expressed in the salivary glands, midguts and other tissues of unfed ticks and ticks that were fed for 24-120 h, except for ISCW017861 that are restricted to the 24 h feeding time point. On the basis of mRNA expression patterns, I. scapularis cystatins, ISCW017861, ISCW011771, ISCW002215 and ISCW0024528 that are highly expressed at 24 h are likely involved in regulating early stage tick feeding events such as tick attachment onto host skin and creation of the feeding lesion. Similarly, ISCW018602, ISCW018603 and ISCW000447 that show 2-3 fold transcript increase by 120 h of feeding are likely associated with blood meal up take, while those that maintain steady state expression levels (ISCW018600, ISCW018601 and ISCW018604) during feeding may not be associated with tick feeding regulation. We discuss our findings in the context of advancing our knowledge of tick

  10. Sequencing and bioinformatics-based analyses of the microRNA transcriptome in hepatitis B-related hepatocellular carcinoma.

    Directory of Open Access Journals (Sweden)

    Yoshiaki Mizuguchi

    Full Text Available MicroRNAs (miRNAs participate in crucial biological processes, and it is now evident that miRNA alterations are involved in the progression of human cancers. Recent studies on miRNA profiling performed with cloning suggest that sequencing is useful for the detection of novel miRNAs, modifications, and precise compositions and that miRNA expression levels calculated by clone count are reproducible. Here we focus on sequencing of miRNA to obtain a comprehensive profile and characterization of these transcriptomes as they relate to human liver. Sequencing using 454 sequencing and conventional cloning from 22 pair of HCC and adjacent normal liver (ANL and 3 HCC cell lines identified reliable reads of more than 314000 miRNAs from HCC and more than 268000 from ANL for registered human miRNAs. Computational bioinformatics identified 7 novel miRNAs with high conservation, 15 novel opposite miRNAs, and 3 novel antisense miRNAs. Moreover sequencing can detect miRNA modifications including adenosine-to-inosine editing in miR-376 families. Expression profiling using clone count analysis was used to identify miRNAs that are expressed aberrantly in liver cancer including miR-122, miR-21, and miR-34a. Furthermore, sequencing-based miRNA clustering, but not individual miRNA, detects high risk patients who have high potentials for early tumor recurrence after liver surgery (P = 0.006, and which is the only significant variable among pathological and clinical and variables (P = 0,022. We believe that the combination of sequencing and bioinformatics will accelerate the discovery of novel miRNAs and biomarkers involved in human liver cancer.

  11. NMR spectroscopic and bioinformatic analyses of the LTBP1 C-terminus reveal a highly dynamic domain organisation.

    Directory of Open Access Journals (Sweden)

    Ian B Robertson

    Full Text Available Proteins from the LTBP/fibrillin family perform key structural and functional roles in connective tissues. LTBP1 forms the large latent complex with TGFβ and its propeptide LAP, and sequesters the latent growth factor to the extracellular matrix. Bioinformatics studies suggest the main structural features of the LTBP1 C-terminus are conserved through evolution. NMR studies were carried out on three overlapping C-terminal fragments of LTBP1, comprising four domains with characterised homologues, cbEGF14, TB3, EGF3 and cbEGF15, and three regions with no homology to known structures. The NMR data reveal that the four domains adopt canonical folds, but largely lack the interdomain interactions observed with homologous fibrillin domains; the exception is the EGF3-cbEGF15 domain pair which has a well-defined interdomain interface. (15N relaxation studies further demonstrate that the three interdomain regions act as flexible linkers, allowing a wide range of motion between the well-structured domains. This work is consistent with the LTBP1 C-terminus adopting a flexible "knotted rope" structure, which may facilitate cell matrix interactions, and the accessibility to proteases or other factors that could contribute to TGFβ activation.

  12. MYC cis-Elements in PsMPT Promoter Is Involved in Chilling Response of Paeonia suffruticosa.

    Directory of Open Access Journals (Sweden)

    Yuxi Zhang

    Full Text Available The MPT transports Pi to synthesize ATP. PsMPT, a chilling-induced gene, was previously reported to promote energy metabolism during bud dormancy release in tree peony. In this study, the regulatory elements of PsMPT promoter involved in chilling response were further analyzed. The PsMPT transcript was detected in different tree peony tissues and was highly expressed in the flower organs, including petal, stigma and stamen. An 1174 bp of the PsMPT promoter was isolated by TAIL-PCR, and the PsMPT promoter::GUS transgenic Arabidopsis was generated and analyzed. GUS staining and qPCR showed that the promoter was active in mainly the flower stigma and stamen. Moreover, it was found that the promoter activity was enhanced by chilling, NaCl, GA, ACC and NAA, but inhibited by ABA, mannitol and PEG. In transgenic plants harboring 421 bp of the PsMPT promoter, the GUS gene expression and the activity were significantly increased by chilling treatment. When the fragment from -421 to -408 containing a MYC cis-element was deleted, the chilling response could not be observed. Further mutation analysis confirmed that the MYC element was one of the key motifs responding to chilling in the PsMPT promoter. The present study provides useful information for further investigation of the regulatory mechanism of PsMPT during the endo-dormancy release.

  13. Insights into SCP/TAPS proteins of liver flukes based on large-scale bioinformatic analyses of sequence datasets.

    Directory of Open Access Journals (Sweden)

    Cinzia Cantacessi

    Full Text Available BACKGROUND: SCP/TAPS proteins of parasitic helminths have been proposed to play key roles in fundamental biological processes linked to the invasion of and establishment in their mammalian host animals, such as the transition from free-living to parasitic stages and the modulation of host immune responses. Despite the evidence that SCP/TAPS proteins of parasitic nematodes are involved in host-parasite interactions, there is a paucity of information on this protein family for parasitic trematodes of socio-economic importance. METHODOLOGY/PRINCIPAL FINDINGS: We conducted the first large-scale study of SCP/TAPS proteins of a range of parasitic trematodes of both human and veterinary importance (including the liver flukes Clonorchis sinensis, Opisthorchis viverrini, Fasciola hepatica and F. gigantica as well as the blood flukes Schistosoma mansoni, S. japonicum and S. haematobium. We mined all current transcriptomic and/or genomic sequence datasets from public databases, predicted secondary structures of full-length protein sequences, undertook systematic phylogenetic analyses and investigated the differential transcription of SCP/TAPS genes in O. viverrini and F. hepatica, with an emphasis on those that are up-regulated in the developmental stages infecting the mammalian host. CONCLUSIONS: This work, which sheds new light on SCP/TAPS proteins, guides future structural and functional explorations of key SCP/TAPS molecules associated with diseases caused by flatworms. Future fundamental investigations of these molecules in parasites and the integration of structural and functional data could lead to new approaches for the control of parasitic diseases.

  14. Application of bioinformatics in tropical medicine

    Institute of Scientific and Technical Information of China (English)

    Wiwanitkit V

    2008-01-01

    Bioinformatics is a usage of information technology to help solve biological problems by designing novel and in-cisive algorithms and methods of analyses.Bioinformatics becomes a discipline vital in the era of post-genom-ics.In this review article,the application of bioinformatics in tropical medicine will be presented and dis-cussed.

  15. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

    Directory of Open Access Journals (Sweden)

    Alexandre G. de Brevern

    2015-01-01

    Full Text Available Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with abundance of methods, tools, and data formats. To overcome these problems, focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from the Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

  16. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies.

    Science.gov (United States)

    de Brevern, Alexandre G; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain

    2015-01-01

    Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with abundance of methods, tools, and data formats. To overcome these problems, focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from the Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

  17. Regulation of alternative splicing of the receptor for advanced glycation endproducts (RAGE) through G-rich cis-elements and heterogenous nuclear ribonucleoprotein H.

    Science.gov (United States)

    Ohe, Kazuyo; Watanabe, Takuo; Harada, Shin-ichi; Munesue, Seiichi; Yamamoto, Yasuhiko; Yonekura, Hideto; Yamamoto, Hiroshi

    2010-05-01

    Receptor for advanced glycation endproducts (RAGE) is a cell-surface receptor. The binding of ligands to membrane-bound RAGE (mRAGE) evokes cellular responses involved in various pathological processes. Previously, we identified a novel soluble form, endogenous secretory RAGE (esRAGE) generated by alternative 5' splice site selection in intron 9 that leads to extension of exon 9 (exon 9B). Because esRAGE works as an antagonistic decoy receptor, the elucidation of regulatory mechanism of the alternative splicing is important to understand RAGE-related pathological processes. Here, we identified G-rich cis-elements within exon 9B for regulation of the alternative splicing using a RAGE minigene. Mutagenesis of the G-rich cis-elements caused a drastic increase in the esRAGE/mRAGE ratio in the minigene-transfected cells and in loss of binding of the RNA motif to heterogenous nuclear ribonucleoprotein (hnRNP) H. On the other hand, the artificial introduction of a G-stretch in exon 9B caused a drastic decrease in the esRAGE/mRAGE ratio accompanied by the binding of hnRNP H to the RNA motif. Thus, the G-stretches within exon 9B regulate RAGE alternative splicing via interaction with hnRNP H. The findings should provide a molecular basis for the development of medicines for RAGE-related disorders that could modulate esRAGE/mRAGE ratio.

  18. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  19. Proteomic and bioinformatic analyses of possible target-related proteins of gambogic acid in human breast carcinoma MDA-MB-231 cells.

    Science.gov (United States)

    Li, Dong; Song, Xiao-Yi; Yue, Qing-Xi; Cui, Ya-Jun; Liu, Miao; Feng, Li-Xing; Wu, Wan-Ying; Jiang, Bao-Hong; Yang, Min; Qu, Xiao-Bo; Liu, Xuan; Guo, De-An

    2015-01-01

    Gambogic acid (GA) is an anticancer agent in phase ‖b clinical trial in China but its mechanism of action has not been fully clarified. The present study was designed to search the possible target-related proteins of GA in cancer cells using proteomic method and establish possible network using bioinformatic analysis. Cytotoxicity and anti-migration effects of GA in MDA-MB-231 cells were checked using MTT assay, flow cytometry, wound migration assay, and chamber migration assay. Possible target-related proteins of GA at early (3 h) and late stage (24 h) of treatment were searched using a proteomic technology, two-dimensional electrophoresis (2-DE). The possible network of GA was established using bioinformatic analysis. The intracellular expression levels of vimentin, keratin 18, and calumenin were determined using Western blotting. GA inhibited cell proliferation and induced cell cycle arrest at G2/M phase and apoptosis in MDA-MB-231 cells. Additionally, GA exhibited anti-migration effects at non-toxic doses. In 2-DE analysis, totally 23 possible GA targeted proteins were found, including those with functions in cytoskeleton and transport, regulation of redox state, metabolism, ubiquitin-proteasome system, transcription and translation, protein transport and modification, and cytokine. Network analysis of these proteins suggested that cytoskeleton-related proteins might play important roles in the effects of GA. Results of Western blotting confirmed the cleavage of vimentin, increase in keratin 18, and decrease in calumenin levels in GA-treated cells. In summary, GA is a multi-target compound and its anti-cancer effects may be based on several target-related proteins such as cytoskeleton-related proteins.

  20. Engineering BioInformatics

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    @@ With the completion of human genome sequencing, a new era of bioinformatics st arts. On one hand, due to the advance of high throughput DNA microarray technol ogies, functional genomics such as gene expression information has increased exp onentially and will continue to do so for the foreseeable future. Conventional m eans of storing, analysing and comparing related data are already overburdened. Moreover, the rich information in genes , their functions and their associated wide biological implication requires new technologies of analysing data that employ sophisticated statistical and machine learning algorithms, powerful com puters and intensive interaction together different data sources such as seque nce data, gene expression data, proteomics data and metabolic pathway informati on to discover complex genomic structures and functional patterns with other bi ological process to gain a comprehensive understanding of cell physiology.

  1. An Introduction to Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    SHENG Qi-zheng; De Moor Bart

    2004-01-01

    As a newborn interdisciplinary field, bioinformatics is receiving increasing attention from biologists, computer scientists, statisticians, mathematicians and engineers. This paper briefly introduces the birth, importance, and extensive applications of bioinformatics in the different fields of biological research. A major challenge in bioinformatics - the unraveling of gene regulation - is discussed in detail.

  2. Statement complementing the EFSA opinion on application EFSA GMO UK 2007 41 (cotton MON 88913 for food and feed uses, import and processing taking into consideration updated bioinformatic analyses

    Directory of Open Access Journals (Sweden)

    EFSA Panel on Genetically Modified Organisms (GMO

    2014-03-01

    Full Text Available In this statement, the EFSA GMO Panel responds to a request from the European Commission (EC to complement its partially inconclusive scientific opinion on cotton MON 88913 taking into consideration updated bioinformatic analyses submitted by the applicant after the adoption. Similarity searches assessed the identity of the genomic sequences flanking the MON 88913 insert, the potential of creating open reading frames (ORFs showing similarity to known allergens or toxins and the similarity of the newly expressed CP4 EPSPS protein to known allergens or toxins. Having assessed these searches, the EFSA GMO Panel did not identify interruptions of known cotton genes or any safety issue arising from the identified ORFs including the newly expressed CP4 EPSPS protein. In conclusion, the EFSA GMO Panel considers that cotton MON 88913, as assessed in the scientific opinion on application EFSA-GMO-UK-2007-41 and in the supplementary bioinformatic dataset, is as safe and nutritious as its conventional counterpart and commercial cotton varieties with respect to potential effects on human and animal health and the environment in the context of its intended uses.

  3. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio eHernández

    2015-06-01

    Full Text Available Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful that Bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: a remote homology searches using Psi-Blast, b detection of functional motifs and domains, c analysis of data from protein-protein interaction databases (PPIs, d match the query protein sequence to 3D databases (i.e., algorithms as PISITE, e mutation correlation analysis between amino acids by algorithms as MISTIC. Programs designed to identify functional motif/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs have the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations –it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/, previously published by our group, has been used as a benchmark for the all of the analyses.

  4. CstF-64 and 3′-UTR cis-element determine Star-PAP specificity for target mRNA selection by excluding PAPα

    Science.gov (United States)

    Kandala, Divya T.; Mohan, Nimmy; A, Vivekanand; AP, Sudheesh; G, Reshmi; Laishram, Rakesh S.

    2016-01-01

    Almost all eukaryotic mRNAs have a poly (A) tail at the 3′-end. Canonical PAPs (PAPα/γ) polyadenylate nuclear pre-mRNAs. The recent identification of the non-canonical Star-PAP revealed specificity of nuclear PAPs for pre-mRNAs, yet the mechanism how Star-PAP selects mRNA targets is still elusive. Moreover, how Star-PAP target mRNAs having canonical AAUAAA signal are not regulated by PAPα is unclear. We investigate specificity mechanisms of Star-PAP that selects pre-mRNA targets for polyadenylation. Star-PAP assembles distinct 3′-end processing complex and controls pre-mRNAs independent of PAPα. We identified a Star-PAP recognition nucleotide motif and showed that suboptimal DSE on Star-PAP target pre-mRNA 3′-UTRs inhibit CstF-64 binding, thus preventing PAPα recruitment onto it. Altering 3′-UTR cis-elements on a Star-PAP target pre-mRNA can switch the regulatory PAP from Star-PAP to PAPα. Our results suggest a mechanism of poly (A) site selection that has potential implication on the regulation of alternative polyadenylation. PMID:26496945

  5. Deep Learning in Bioinformatics

    OpenAIRE

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-01-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current res...

  6. Arabidopsis miR171-targeted scarecrow-like proteins bind to GT cis-elements and mediate gibberellin-regulated chlorophyll biosynthesis under light conditions.

    Science.gov (United States)

    Ma, Zhaoxue; Hu, Xupeng; Cai, Wenjuan; Huang, Weihua; Zhou, Xin; Luo, Qian; Yang, Hongquan; Wang, Jiawei; Huang, Jirong

    2014-08-01

    An extraordinarily precise regulation of chlorophyll biosynthesis is essential for plant growth and development. However, our knowledge on the complex regulatory mechanisms of chlorophyll biosynthesis is very limited. Previous studies have demonstrated that miR171-targeted scarecrow-like proteins (SCL6/22/27) negatively regulate chlorophyll biosynthesis via an unknown mechanism. Here we showed that SCLs inhibit the expression of the key gene encoding protochlorophyllide oxidoreductase (POR) in light-grown plants, but have no significant effect on protochlorophyllide biosynthesis in etiolated seedlings. Histochemical analysis of β-glucuronidase (GUS) activity in transgenic plants expressing pSCL27::rSCL27-GUS revealed that SCL27-GUS accumulates at high levels and suppresses chlorophyll biosynthesis at the leaf basal proliferation region during leaf development. Transient gene expression assays showed that the promoter activity of PORC is indeed regulated by SCL27. Consistently, chromatin immunoprecipitation and quantitative PCR assays showed that SCL27 binds to the promoter region of PORC in vivo. An electrophoretic mobility shift assay revealed that SCL27 is directly interacted with G(A/G)(A/T)AA(A/T)GT cis-elements of the PORC promoter. Furthermore, genetic analysis showed that gibberellin (GA)-regulated chlorophyll biosynthesis is mediated, at least in part, by SCLs. We demonstrated that SCL27 interacts with DELLA proteins in vitro and in vivo by yeast-two-hybrid and coimmunoprecipitation analysis and found that their interaction reduces the binding activity of SCL27 to the PORC promoter. Additionally, we showed that SCL27 activates MIR171 gene expression, forming a feedback regulatory loop. Taken together, our data suggest that the miR171-SCL module is critical for mediating GA-DELLA signaling in the coordinate regulation of chlorophyll biosynthesis and leaf growth in light.

  7. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  8. Systematic analysis of cis-elements in unstable mRNAs demonstrates that CUGBP1 is a key regulator of mRNA decay in muscle cells.

    Directory of Open Access Journals (Sweden)

    Jerome E Lee

    Full Text Available BACKGROUND: Dramatic changes in gene expression occur in response to extracellular stimuli and during differentiation. Although transcriptional effects are important, alterations in mRNA decay also play a major role in achieving rapid and massive changes in mRNA abundance. Moreover, just as transcription factor activity varies between different cell types, the factors influencing mRNA decay are also cell-type specific. PRINCIPAL FINDINGS: We have established the rates of decay for over 7000 transcripts expressed in mouse C2C12 myoblasts. We found that GU-rich (GRE and AU-rich (ARE elements are over-represented in the 3'UTRs of short-lived mRNAs and that these mRNAs tend to encode factors involved in cell cycle and transcription regulation. Stabilizing elements were also identified. By comparing mRNA decay rates in C2C12 cells with those previously measured for pluripotent and differentiating embryonic stem (ES cells, we identified several groups of transcripts that exhibit cell-type specific decay rates. Further, whereas in C2C12 cells the impact of GREs on mRNA decay appears to be greater than that of AREs, AREs are more significant in ES cells, supporting the idea that cis elements make a cell-specific contribution to mRNA stability. GREs are recognized by CUGBP1, an RNA-binding protein and instability factor whose function is affected in several neuromuscular diseases. We therefore utilized RNA immunoprecipitation followed by microarray (RIP-Chip to identify CUGBP1-associated transcripts. These mRNAs also showed dramatic enrichment of GREs in their 3'UTRs and encode proteins linked with cell cycle, and intracellular transport. Interestingly several CUGBP1 substrate mRNAs, including those encoding the myogenic transcription factors Myod1 and Myog, are also bound by the stabilizing factor HuR in C2C12 cells. Finally, we show that several CUGBP1-associated mRNAs containing 3'UTR GREs, including Myod1, are stabilized in cells depleted of CUGBP1

  9. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-07-29

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies.

  10. Bioinformatic Analyses of Subgroup-A Members of the Wheat bZIP Transcription Factor Family and Functional Identification of TabZIP174 Involved in Drought Stress Response

    Directory of Open Access Journals (Sweden)

    Xueyin Li

    2016-11-01

    Full Text Available Extensive studies in Arabidopsis and rice have demonstrated that Subgroup-A members of the bZIP transcription factor family play important roles in plant responses to multiple abiotic stresses. Although common wheat (Triticum aestivum is one of the most widely cultivated and consumed food crops in the world, there are limited investigations into Subgroup A of the bZIP family in wheat. In this study, we performed bioinformatic analyses of the 41 Subgroup-A members of the wheat bZIP family. Phylogenetic and conserved motif analyses showed that most of the Subgroup-A bZIP proteins involved in abiotic stress responses of wheat, Arabidopsis and rice clustered in Clade A1 of the phylogenetic tree, and shared a majority of conserved motifs, suggesting the potential importance of Clade-A1 members in abiotic stress responses. Gene structure analysis showed that TabZIP genes with close phylogenetic relationships tended to possess similar exon-intron compositions, and the positions of introns in the hinge regions of the bZIP domains were highly conserved, whereas introns in the leucine zipper regions were at variable positions. Additionally, eleven groups of homologs and two groups of tandem paralogs were also identified in Subgroup A of the wheat bZIP family. Expression profiling analysis indicated that most Subgroup-A TabZIP genes were responsive to abscisic acid and various abiotic stress treatments. TabZIP27, TabZIP74, TabZIP138 and TabZIP174 proteins were localized in the nucleus of wheat protoplasts, whereas TabZIP9-GFP fusion protein was simultaneously present in the nucleus, cytoplasm and cell membrane. Transgenic Arabidopsis overexpressing TabZIP174 displayed increased seed germination rates and primary root lengths under drought treatments. Overexpression of TabZIP174 in transgenic Arabidopsis conferred enhanced drought tolerance, and transgenic plants exhibited lower water loss rates, higher survival rates, higher proline, soluble sugar and leaf

  11. Bioinformatic Analyses of Subgroup-A Members of the Wheat bZIP Transcription Factor Family and Functional Identification of TabZIP174 Involved in Drought Stress Response.

    Science.gov (United States)

    Li, Xueyin; Feng, Biane; Zhang, Fengjie; Tang, Yimiao; Zhang, Liping; Ma, Lingjian; Zhao, Changping; Gao, Shiqing

    2016-01-01

    Extensive studies in Arabidopsis and rice have demonstrated that Subgroup-A members of the bZIP transcription factor family play important roles in plant responses to multiple abiotic stresses. Although common wheat (Triticum aestivum) is one of the most widely cultivated and consumed food crops in the world, there are limited investigations into Subgroup A of the bZIP family in wheat. In this study, we performed bioinformatic analyses of the 41 Subgroup-A members of the wheat bZIP family. Phylogenetic and conserved motif analyses showed that most of the Subgroup-A bZIP proteins involved in abiotic stress responses of wheat, Arabidopsis, and rice clustered in Clade A1 of the phylogenetic tree, and shared a majority of conserved motifs, suggesting the potential importance of Clade-A1 members in abiotic stress responses. Gene structure analysis showed that TabZIP genes with close phylogenetic relationships tended to possess similar exon-intron compositions, and the positions of introns in the hinge regions of the bZIP domains were highly conserved, whereas introns in the leucine zipper regions were at variable positions. Additionally, eleven groups of homologs and two groups of tandem paralogs were also identified in Subgroup A of the wheat bZIP family. Expression profiling analysis indicated that most Subgroup-A TabZIP genes were responsive to abscisic acid and various abiotic stress treatments. TabZIP27, TabZIP74, TabZIP138, and TabZIP174 proteins were localized in the nucleus of wheat protoplasts, whereas TabZIP9-GFP fusion protein was simultaneously present in the nucleus, cytoplasm, and cell membrane. Transgenic Arabidopsis overexpressing TabZIP174 displayed increased seed germination rates and primary root lengths under drought treatments. Overexpression of TabZIP174 in transgenic Arabidopsis conferred enhanced drought tolerance, and transgenic plants exhibited lower water loss rates, higher survival rates, higher proline, soluble sugar, and leaf chlorophyll

  12. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  13. Feature selection in bioinformatics

    Science.gov (United States)

    Wang, Lipo

    2012-06-01

    In bioinformatics, there are often a large number of input features. For example, there are millions of single nucleotide polymorphisms (SNPs) that are genetic variations which determine the dierence between any two unrelated individuals. In microarrays, thousands of genes can be proled in each test. It is important to nd out which input features (e.g., SNPs or genes) are useful in classication of a certain group of people or diagnosis of a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sucient for tasks at hand and we demonstrate this with some real-world data.

  14. Distributed computing in bioinformatics.

    Science.gov (United States)

    Jain, Eric

    2002-01-01

    This paper provides an overview of methods and current applications of distributed computing in bioinformatics. Distributed computing is a strategy of dividing a large workload among multiple computers to reduce processing time, or to make use of resources such as programs and databases that are not available on all computers. Participating computers may be connected either through a local high-speed network or through the Internet.

  15. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  16. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Full Text Available Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating, to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. Open standards, data

  17. Primary structure and promoter analysis of leghemoglobin genes of the stem-nodulated tropical legume Sesbania rostrata: conserved coding sequences, cis-elements and trans-acting factors

    DEFF Research Database (Denmark)

    Metz, B A; Welters, P; Hoffmann, H J;

    1988-01-01

    The primary structure of a leghemoglobin (lb) gene from the stem-nodulated, tropical legume Sesbania rostrata and two lb gene promoter regions was analysed. The S. rostrata lb gene structure and Lb amino acid composition were found to be highly conserved with previously described lb genes and Lb...

  18. Pattern recognition in bioinformatics.

    Science.gov (United States)

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.

  19. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  20. Bioinformatics meets parasitology.

    Science.gov (United States)

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  1. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  2. Virtual bioinformatics distance learning suite*.

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-05-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, "Introduction to Bioinformatics" and "Bioinformatics in Functional Genomics." Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.

  3. Bioinformatics Approaches for Human Gut Microbiome Research

    Directory of Open Access Journals (Sweden)

    Zhijun Zheng

    2016-07-01

    Full Text Available The human microbiome has received much attention because many studies have reported that the human gut microbiome is associated with several diseases. The very large datasets that are produced by these kinds of studies means that bioinformatics approaches are crucial for their analysis. Here, we systematically reviewed bioinformatics tools that are commonly used in microbiome research, including a typical pipeline and software for sequence alignment, abundance profiling, enterotype determination, taxonomic diversity, identifying differentially abundant species/genes, gene cataloging, and functional analyses. We also summarized the algorithms and methods used to define metagenomic species and co-abundance gene groups to expand our understanding of unclassified and poorly understood gut microbes that are undocumented in the current genome databases. Additionally, we examined the methods used to identify metagenomic biomarkers based on the gut microbiome, which might help to expand the knowledge and approaches for disease detection and monitoring.

  4. Exon B of human surfactant protein A2 mRNA, alone or within its surrounding sequences, interacts with 14-3-3; role of cis-elements and secondary structure.

    Science.gov (United States)

    Noutsios, Georgios T; Silveyra, Patricia; Bhatti, Faizah; Floros, Joanna

    2013-06-01

    Human surfactant protein A, an innate immunity molecule, is encoded by two genes: SFTPA1 (SP-A1) and SFTPA2 (SP-A2). The 5' untranslated (5'UTR) splice variant of SP-A2 (ABD), but not of SP-A1 (AD), contains exon B (eB), which is an enhancer for transcription and translation. We investigated whether eB contains cis-regulatory elements that bind trans-acting factors in a sequence-specific manner as well as the role of the eB mRNA secondary structure. Binding of cytoplasmic NCI-H441 proteins to wild-type eB, eB mutant, AD, and ABD 5'UTR mRNAs were studied by RNA electromobility shift assays (REMSAs). The bound proteins were identified by mass spectroscopy and specific antibodies (Abs). We found that 1) proteins bind eB mRNA in a sequence-specific manner, with two cis-elements identified within eB to be important; 2) eB secondary structure is necessary for binding; 3) mass spectroscopy and specific Abs in REMSAs identified 14-3-3 proteins to bind (directly or indirectly) eB and the natural SP-A2 (ABD) splice variant but not the SP-A1 (AD) splice variant; 4) other ribosomal and cytoskeletal proteins, and translation factors, are also present in the eB mRNA-protein complex; 5) knockdown of 14-3-3 β/α isoform resulted in a downregulation of SP-A2 expression. In conclusion, proteins including the 14-3-3 family bind two cis-elements within eB of hSP-A2 mRNA in a sequence- and secondary structure-specific manner. Differential regulation of SP-A1 and SP-A2 is mediated by the 14-3-3 protein family as well as by a number of other proteins that bind UTRs with or without eB mRNA.

  5. String Mining in Bioinformatics

    Science.gov (United States)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, this point of view, however, has not been explicitly embraced neither in the data mining nor in the sequence analysis text books, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  6. Genome Exploitation and Bioinformatics Tools

    Science.gov (United States)

    de Jong, Anne; van Heel, Auke J.; Kuipers, Oscar P.

    Bioinformatic tools can greatly improve the efficiency of bacteriocin screening efforts by limiting the amount of strains. Different classes of bacteriocins can be detected in genomes by looking at different features. Finding small bacteriocins can be especially challenging due to low homology and because small open reading frames (ORFs) are often omitted from annotations. In this chapter, several bioinformatic tools/strategies to identify bacteriocins in genomes are discussed.

  7. Clustering Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Muhammad Ali Masood

    2015-01-01

    Full Text Available Dealing with data means to group information into a set of categories either in order to learn new artifacts or understand new domains. For this purpose researchers have always looked for the hidden patterns in data that can be defined and compared with other known notions based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, having the tools of data classification and data clustering, is one of the most powerful techniques to deal with data in such a manner that it can help researchers identify the required information. As a step forward to address this challenge, experts have utilized clustering techniques as a mean of exploring hidden structure and patterns in underlying data. Improved stability, robustness and accuracy of unsupervised data classification in many fields including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself as a reliable tool. To identify the clusters in datasets algorithm are utilized to partition data set into several groups based on the similarity within a group. There is no specific clustering algorithm, but various algorithms are utilized based on domain of data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized based upon different approaches. This paper is a survey of few clustering techniques out of many in data mining. For the purpose five of the most common clustering techniques out of many have been discussed. The clustering techniques which have been surveyed are: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN and Self-Organizing Map (SOM clustering.

  8. Kaposi's sarcoma-associated herpesvirus (human herpesvirus 8) replication and transcription factor activates the K9 (vIRF) gene through two distinct cis elements by a non-DNA-binding mechanism.

    Science.gov (United States)

    Ueda, Keiji; Ishikawa, Kayo; Nishimura, Ken; Sakakibara, Shuhei; Do, Eunju; Yamanishi, Koichi

    2002-12-01

    The replication and transcription activator (RTA) of Kaposi's sarcoma-associated herpesvirus (KSHV), or human herpesvirus 8, a homologue of Epstein-Barr virus BRLF1 or Rta, is a strong transactivator and inducer of lytic replication. RTA acting alone can induce lytic replication of KSHV in infected cell lines that originated from primary effusion lymphomas, leading to virus production. During the lytic replication process, RTA activates many kinds of genes, including polyadenylated nuclear RNA, K8, K9 (vIRF), ORF57, and so on. We focused here on the mechanism of how RTA upregulates the K9 (vIRF) promoter and identified two independent cis-acting elements in the K9 (vIRF) promoter that responded to RTA. These elements were finally confined to the sequence 5'-TCTGGGACAGTC-3' in responsive element (RE) I-2B and the sequence 5'-GTACTTAAAATA-3' in RE IIC-2, both of which did not share sequence homology. Multiple factors bound specifically with these elements, and their binding was correlated with the RTA-responsive activity. Electrophoretic mobility shift assay with nuclear extract from infected cells and the N-terminal part of RTA expressed in Escherichia coli, however, did not show that RTA interacted directly with these elements, in contrast to the RTA responsive elements in the PAN/K12 promoter region, the ORF57/K8 promoter region. Thus, it was likely that RTA could transactivate several kinds of unique cis elements without directly binding to the responsive elements, probably through cooperation with other DNA-binding factors.

  9. Training Experimental Biologists in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Pedro Fernandes

    2012-01-01

    Full Text Available Bioinformatics, for its very nature, is devoted to a set of targets that constantly evolve. Training is probably the best response to the constant need for the acquisition of bioinformatics skills. It is interesting to assess the effects of training in the different sets of researchers that make use of it. While training bench experimentalists in the life sciences, we have observed instances of changes in their attitudes in research that, if well exploited, can have beneficial impacts in the dialogue with professional bioinformaticians and influence the conduction of the research itself.

  10. Bioinformatics interoperability: all together now !

    NARCIS (Netherlands)

    Meganck, B.; Mergen, P.; Meirte, D.

    2009-01-01

    The following text presents some personal ideas about the way (bio)informatics2 is heading, along with some examples of how our institution – the Royal Museum for Central Africa (RMCA) – is gearing up for these new times ahead. It tries to find the important trends amongst the buzzwords, and to demo

  11. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...

  12. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  13. Visualising "Junk" DNA through Bioinformatics

    Science.gov (United States)

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology,and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  14. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  15. Virginia Bioinformatics Institute awards Transdisciplinary Team Science

    OpenAIRE

    Bland, Susan

    2009-01-01

    The Virginia Bioinformatics Institute at Virginia Tech, in collaboration with Virginia Tech's Ph.D. program in genetics, bioinformatics, and computational biology, has awarded three fellowships in support of graduate work in transdisciplinary team science.

  16. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, Brend

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene ex

  17. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer;

    2016-01-01

    The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especiall...

  18. Undergraduate Bioinformatics Workshops Provide Perceived Skills

    Directory of Open Access Journals (Sweden)

    Robin Herlands Cresiski

    2014-07-01

    Full Text Available Bioinformatics is becoming an important part of undergraduate curriculum, but expertise and well-evaluated teaching materials may not be available on every campus. Here, a guest speaker was utilized to introduce bioinformatics and web-available exercises were adapted for student investigation. Students used web-based nucleotide comparison tools to examine the medical and evolutionary relevance of a unidentified genetic sequence. Based on pre- and post-workshop surveys, there were significant gains in the students understanding of bioinformatics, as well as their perceived skills in using bioinformatics tools. The relevance of bioinformatics to a student’s career seemed dependent on career aspirations.

  19. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  20. An introduction to proteome bioinformatics.

    Science.gov (United States)

    Jones, Andrew R; Hubbard, Simon J

    2010-01-01

    This book is part of the Methods in Molecular Biology series, and provides a general overview of computational approaches used in proteome research. In this chapter, we give an overview of the scope of the book in terms of current proteomics experimental techniques and the reasons why computational approaches are needed. We then give a summary of each chapter, which together provide a picture of the state of the art in proteome bioinformatics research.

  1. POMA analyses as new efficient bioinformatics' platform to predict and optimise bioactivity of synthesized 3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carboxamide/carbothioamide analogues.

    Science.gov (United States)

    Ahsan, Mohamed Jawed; Govindasamy, Jeyabalan; Khalilullah, Habibullah; Mohan, Govind; Stables, James P

    2012-12-01

    A series of 43, 3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carboxamide/carbothioamide analogues (D01-D43) were analysed using Petra, Osiris, Molinspiration and ALOGPS (POMA) to identify pharmacophore, toxicity prediction, lipophilicity and bioactivity. All the compounds were evaluated for anti-HIV activity. 3-(4-Chlorophenyl)-N-(4-fluorophenyl)-6,7-dimethoxy-3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carboxamide (D07) was found to be the most active with IC(50)>4.83 μM and CC(50) 4.83 μM. 3-(4-Fluorophenyl)-6,7-dimethoxy-3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carbothioamide (D41) was found to be the most active compound against bacterial strains with MIC of 4 μg/ml, comparable to the standard drug ciprofloxacin while 3-(4-methoxyphenyl)-6,7-dimethoxy-3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carboxamide (D38) was found to be the most active compound against fungal strains with MIC 2-4 μg/ml, however less active than standard fluconazole. Toxicities prediction by Osiris were well supported and experimentally verified with exception of some compounds. In anticonvulsant screening, 3-(4-fluorophenyl)-N-(4-chlorophenyl)-6,7-dimethoxy-3a,4-dihydro-3H-indeno[1,2-c]pyrazole-2-carboxamide (D09) showed maximum activity showing 100% (4/4, 0.25-0.5h) and 75% (3/4, 1.0 h) protection against minimal clonic seizure test without any toxicity.

  2. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and  Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, at June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  3. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response...... and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  4. Bioinformatic approaches to interrogating vitamin D receptor signaling.

    Science.gov (United States)

    Campbell, Moray J

    2017-03-10

    Bioinformatics applies unbiased approaches to develop statistically-robust insight into health and disease. At the global, or "20,000 foot" view bioinformatic analyses of vitamin D receptor (NR1I1/VDR) signaling can measure where the VDR gene or protein exerts a genome-wide significant impact on biology; VDR is significantly implicated in bone biology and immune systems, but not in cancer. With a more VDR-centric, or "2000 foot" view, bioinformatic approaches can interrogate events downstream of VDR activity. Integrative approaches can combine VDR ChIP-Seq in cell systems where significant volumes of publically available data are available. For example, VDR ChIP-Seq studies can be combined with genome-wide association studies to reveal significant associations to immune phenotypes. Similarly, VDR ChIP-Seq can be combined with data from Cancer Genome Atlas (TCGA) to infer the impact of VDR target genes in cancer progression. Therefore, bioinformatic approaches can reveal what aspects of VDR downstream networks are significantly related to disease or phenotype.

  5. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes-the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software-the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.

  6. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  7. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  8. Integration of bioinformatics and synthetic promoters leads to the discovery of novel elicitor-responsive cis-regulatory sequences in Arabidopsis.

    Science.gov (United States)

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J; Hehl, Reinhard

    2012-09-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction.

  9. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  10. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  11. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  12. The Aspergillus Mine - publishing bioinformatics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Rasmussen, Jane Lind Nybo; Theobald, Sebastian

    so with no computational specialist. Here we present a setup for analysis and publication of genome data of 70 species of Aspergillus fungi. The platform is based on R, Python and uses the RShiny framework to create interactive web‐applications. It allows all participants to create interactive...... analysis which can be shared with the team and in connection with publications. We present analysis for investigation of genetic diversity, secondary and primary metabolism and general data overview. The platform, the Aspergillus Mine, is a collection of analysis tools based on data from collaboration...... with the Joint Genome Institute. The Aspergillus Mine is not intended as a genomic data sharing service but instead focuses on creating an environment where the results of bioinformatic analysis is made available for inspection. The data and code is public upon request and figures can be obtained directly from...

  13. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  14. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  15. BIOINFORMATICS FOR UNDERGRADUATES OF LIFE SCIENCE COURSES

    Directory of Open Access Journals (Sweden)

    J.F. De Mesquita

    2007-05-01

    Full Text Available In the recent years, Bioinformatics has emerged as an important research tool. Theability to mine large databases for relevant information has become essential fordifferent life science fields. On the other hand, providing education in bioinformatics toundergraduates is challenging from this multidisciplinary perspective. Therefore, it isimportant to introduced undergraduate students to the available information andcurrent methodologies in Bioinformatics. Here we report the results of a course usinga computer-assisted and problem -based learning model. The syllabus was comprisedof theoretical lectures covering different topics within bioinformatics and practicalactivities. For the latter, we developed a set of step-by-step tutorials based on casestudies. The course was applied to undergraduate students of biological andbiomedical courses. At the end of the course, the students were able to build up astep-by-step tutorial covering a bioinformatics issue.

  16. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  17. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. Results p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files. p3d's strength arises from the combination of a very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP tree, b set theory and c functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  18. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  19. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  20. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  1. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  2. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS, adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded and http://soap.g-language.org/kbws_dl.wsdl (Document/literal.

  3. High-throughput protein analysis integrating bioinformatics and experimental assays.

    Science.gov (United States)

    del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan

    2004-01-01

    The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins.

  4. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  5. KBWS: an EMBOSS associated package for accessing bioinformatics web services.

    Science.gov (United States)

    Oshita, Kazuki; Arakawa, Kazuharu; Tomita, Masaru

    2011-04-29

    The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  6. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  7. Agile parallel bioinformatics workflow management using Pwrake.

    OpenAIRE

    2011-01-01

    Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environm...

  8. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced have given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid and downstream to nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed that the time of most recent common ancestor of human/civet SARS related coronavirus to be 1999-2002, with estimated substitution rate of 4´10-4 to 2´10-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV, between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1. Codon usage bias in coronaviruses were observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.

  9. Regulatory bioinformatics for food and drug safety.

    Science.gov (United States)

    Healy, Marion J; Tong, Weida; Ostroff, Stephen; Eichler, Hans-Georg; Patak, Alex; Neuspiel, Margaret; Deluyker, Hubert; Slikker, William

    2016-10-01

    "Regulatory Bioinformatics" strives to develop and implement a standardized and transparent bioinformatic framework to support the implementation of existing and emerging technologies in regulatory decision-making. It has great potential to improve public health through the development and use of clinically important medical products and tools to manage the safety of the food supply. However, the application of regulatory bioinformatics also poses new challenges and requires new knowledge and skill sets. In the latest Global Coalition on Regulatory Science Research (GCRSR) governed conference, Global Summit on Regulatory Science (GSRS2015), regulatory bioinformatics principles were presented with respect to global trends, initiatives and case studies. The discussion revealed that datasets, analytical tools, skills and expertise are rapidly developing, in many cases via large international collaborative consortia. It also revealed that significant research is still required to realize the potential applications of regulatory bioinformatics. While there is significant excitement in the possibilities offered by precision medicine to enhance treatments of serious and/or complex diseases, there is a clear need for further development of mechanisms to securely store, curate and share data, integrate databases, and standardized quality control and data analysis procedures. A greater understanding of the biological significance of the data is also required to fully exploit vast datasets that are becoming available. The application of bioinformatics in the microbiological risk analysis paradigm is delivering clear benefits both for the investigation of food borne pathogens and for decision making on clinically important treatments. It is recognized that regulatory bioinformatics will have many beneficial applications by ensuring high quality data, validated tools and standardized processes, which will help inform the regulatory science community of the requirements

  10. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  11. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  12. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    Science.gov (United States)

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms.

  13. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  14. Bioinformatic Challenges in Clinical Diagnostic Application of Targeted Next Generation Sequencing: Experience from Pheochromocytoma.

    Directory of Open Access Journals (Sweden)

    Joakim Crona

    Full Text Available Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS compared to Sanger Sequencing. Whereas these novel sequencing processes have a validated robust performance, choice of enrichment method and different available bioinformatic software as reliable analysis tool needs to be further investigated in a diagnostic setting.DNA from 21 patients with genetic variants in SDHB, VHL, EPAS1, RET, (n=17 or clinical criteria of NF1 syndrome (n=4 were included. Targeted NGS was performed using Truseq custom amplicon enrichment sequenced on an Illumina MiSEQ instrument. Results were analysed in parallel using three different bioinformatics pipelines; (1 Commercially available MiSEQ Reporter, fully automatized and integrated software, (2 CLC Genomics Workbench, graphical interface based software, also commercially available, and ICP (3 an in-house scripted custom bioinformatic tool.A tenfold read coverage was achieved in between 95-98% of targeted bases. All workflows had alignment of reads to SDHA and NF1 pseudogenes. Compared to Sanger sequencing, variant calling revealed a sensitivity ranging from 83 to 100% and a specificity of 99.9-100%. Only MiSEQ reporter identified all pathogenic variants in both sequencing runs.We conclude that targeted next generation sequencing have equal quality compared to Sanger sequencing. Enrichment specificity and the bioinformatic performance need to be carefully assessed in a diagnostic setting. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills utilizing already existing commercially available bioinformatics tools.

  15. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  16. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    NARCIS (Netherlands)

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of pr

  17. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotides polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapter 2, 3 and 4), and the subsequent imple

  18. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  19. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  20. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  1. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  2. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  3. Mass spectrometry and bioinformatics analysis data

    Directory of Open Access Journals (Sweden)

    Mainak Dutta

    2015-03-01

    Full Text Available 2DE and 2D-DIGE based proteomics analysis of serum from women with endometriosis revealed several proteins to be dysregulated. A complete list of these proteins along with their mass spectrometry data and subsequent bioinformatics analysis are presented here. The data is related to “Investigation of serum proteome alterations in human endometriosis” by Dutta et al. [1].

  4. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformatic

  5. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Full Text Available Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  6. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS.

    Science.gov (United States)

    Hu, Rongdong; Liu, Guangming; Jiang, Jingfei; Wang, Lixin

    2015-01-01

    Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  7. Agile parallel bioinformatics workflow management using Pwrake

    Directory of Open Access Journals (Sweden)

    Tanaka Masahiro

    2011-09-01

    Full Text Available Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows

  8. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  9. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.;

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  10. Towards bioinformatics assisted infectious disease control

    Directory of Open Access Journals (Sweden)

    Gallego Blanca

    2009-02-01

    Full Text Available Abstract Background This paper proposes a novel framework for bioinformatics assisted biosurveillance and early warning to address the inefficiencies in traditional surveillance as well as the need for more timely and comprehensive infection monitoring and control. It leverages on breakthroughs in rapid, high-throughput molecular profiling of microorganisms and text mining. Results This framework combines the genetic and geographic data of a pathogen to reconstruct its history and to identify the migration routes through which the strains spread regionally and internationally. A pilot study of Salmonella typhimurium genotype clustering and temporospatial outbreak analysis demonstrated better discrimination power than traditional phage typing. Half of the outbreaks were detected in the first half of their duration. Conclusion The microbial profiling and biosurveillance focused text mining tools can enable integrated infectious disease outbreak detection and response environments based upon bioinformatics knowledge models and measured by outcomes including the accuracy and timeliness of outbreak detection.

  11. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed.

  12. Bioinformatics for saffron (Crocus sativus L. improvement

    Directory of Open Access Journals (Sweden)

    Ghulam A. Parray

    2009-02-01

    Full Text Available Saffron (Crocus sativus L. is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots. Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The characterization of the transcriptome of saffron stigmas is the most vital for throwing light on the molecular basis of flavor, color biogenesis, genomic organization and biology of gynoecium of saffron. The information derived can be utilized for constructing biological pathways involved in the biosynthesis of principal components of saffron i.e., crocin, crocetin, safranal, picrocrocin and safchiA

  13. [Applied problems of mathematical biology and bioinformatics].

    Science.gov (United States)

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigations which emerged in the course of work on the project "Human genome". The main applied problems of these sciences are grug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  14. Genome bioinformatics of tomato and potato

    OpenAIRE

    E Datema

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  15. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  16. Application of bioinformatics in chronobiology research.

    Science.gov (United States)

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through "omics" projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  17. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  18. Chapter 16: text mining for translational bioinformatics.

    Directory of Open Access Journals (Sweden)

    K Bretonnel Cohen

    2013-04-01

    Full Text Available Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  19. A library-based bioinformatics services program.

    Science.gov (United States)

    Yarfitz, S; Ketchell, D S

    2000-01-01

    Support for molecular biology researchers has been limited to traditional library resources and services in most academic health sciences libraries. The University of Washington Health Sciences Libraries have been providing specialized services to this user community since 1995. The library recruited a Ph.D. biologist to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services. A survey of laboratory research groups identified areas of greatest need and led to the development of a three-pronged program: consultation, education, and resource development. Outcomes of this program include bioinformatics consultation services, library-based and graduate level courses, networking of sequence analysis tools, and a biological research Web site. Bioinformatics clients are drawn from diverse departments and include clinical researchers in need of tools that are not readily available outside of basic sciences laboratories. Evaluation and usage statistics indicate that researchers, regardless of departmental affiliation or position, require support to access molecular biology and genetics resources. Centralizing such services in the library is a natural synergy of interests and enhances the provision of traditional library resources. Successful implementation of a library-based bioinformatics program requires both subject-specific and library and information technology expertise.

  20. A library-based bioinformatics services program*

    Science.gov (United States)

    Yarfitz, Stuart; Ketchell, Debra S.

    2000-01-01

    Support for molecular biology researchers has been limited to traditional library resources and services in most academic health sciences libraries. The University of Washington Health Sciences Libraries have been providing specialized services to this user community since 1995. The library recruited a Ph.D. biologist to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services. A survey of laboratory research groups identified areas of greatest need and led to the development of a three-pronged program: consultation, education, and resource development. Outcomes of this program include bioinformatics consultation services, library-based and graduate level courses, networking of sequence analysis tools, and a biological research Web site. Bioinformatics clients are drawn from diverse departments and include clinical researchers in need of tools that are not readily available outside of basic sciences laboratories. Evaluation and usage statistics indicate that researchers, regardless of departmental affiliation or position, require support to access molecular biology and genetics resources. Centralizing such services in the library is a natural synergy of interests and enhances the provision of traditional library resources. Successful implementation of a library-based bioinformatics program requires both subject-specific and library and information technology expertise. PMID:10658962

  1. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  2. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  3. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    Science.gov (United States)

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  4. The European Bioinformatics Institute in 2016: Data growth and integration.

    Science.gov (United States)

    Cook, Charles E; Bergman, Mary Todd; Finn, Robert D; Cochrane, Guy; Birney, Ewan; Apweiler, Rolf

    2016-01-04

    New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources.

  5. Meta-learning framework applied in bioinformatics inference system design.

    Science.gov (United States)

    Arredondo, Tomás; Ormazábal, Wladimir

    2015-01-01

    This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.

  6. Mining Cancer Transcriptomes: Bioinformatic Tools and the Remaining Challenges.

    Science.gov (United States)

    Milan, Thomas; Wilhelm, Brian T

    2017-02-22

    The development of next-generation sequencing technologies has had a profound impact on the field of cancer genomics. With the enormous quantities of data being generated from tumor samples, researchers have had to rapidly adapt tools or develop new ones to analyse the raw data to maximize its value. While much of this effort has been focused on improving specific algorithms to get faster and more precise results, the accessibility of the final data for the research community remains a significant problem. Large amounts of data exist but are not easily available to researchers who lack the resources and experience to download and reanalyze them. In this article, we focus on RNA-seq analysis in the context of cancer genomics and discuss the bioinformatic tools available to explore these data. We also highlight the importance of developing new and more intuitive tools to provide easier access to public data and discuss the related issues of data sharing and patient privacy.

  7. Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics

    Science.gov (United States)

    Siegwald, Léa; Touzet, Hélène; Lemoine, Yves; Hot, David

    2017-01-01

    Targeted metagenomics, also known as metagenetics, is a high-throughput sequencing application focusing on a nucleotide target in a microbiome to describe its taxonomic content. A wide range of bioinformatics pipelines are available to analyze sequencing outputs, and the choice of an appropriate tool is crucial and not trivial. No standard evaluation method exists for estimating the accuracy of a pipeline for targeted metagenomics analyses. This article proposes an evaluation protocol containing real and simulated targeted metagenomics datasets, and adequate metrics allowing us to study the impact of different variables on the biological interpretation of results. This protocol was used to compare six different bioinformatics pipelines in the basic user context: Three common ones (mothur, QIIME and BMP) based on a clustering-first approach and three emerging ones (Kraken, CLARK and One Codex) using an assignment-first approach. This study surprisingly reveals that the effect of sequencing errors has a bigger impact on the results that choosing different amplified regions. Moreover, increasing sequencing throughput increases richness overestimation, even more so for microbiota of high complexity. Finally, the choice of the reference database has a bigger impact on richness estimation for clustering-first pipelines, and on correct taxa identification for assignment-first pipelines. Using emerging assignment-first pipelines is a valid approach for targeted metagenomics analyses, with a quality of results comparable to popular clustering-first pipelines, even with an error-prone sequencing technology like Ion Torrent. However, those pipelines are highly sensitive to the quality of databases and their annotations, which makes clustering-first pipelines still the only reliable approach for studying microbiomes that are not well described. PMID:28052134

  8. Mutational and Bioinformatic Analysis of Haloarchaeal Lipobox-Containing Proteins

    Directory of Open Access Journals (Sweden)

    Stefanie Storf

    2010-01-01

    Full Text Available A conserved lipid-modified cysteine found in a protein motif commonly referred to as a lipobox mediates the membrane anchoring of a subset of proteins transported across the bacterial cytoplasmic membrane via the Sec pathway. Sequenced haloarchaeal genomes encode many putative lipoproteins and recent studies have confirmed the importance of the conserved lipobox cysteine for signal peptide processing of three lipobox-containing proteins in the model archaeon Haloferax volcanii. We have extended these in vivo analyses to additional Hfx. volcanii substrates, supporting our previous in silico predictions and confirming the diversity of predicted Hfx. volcanii lipoproteins. Moreover, using extensive comparative secretome analyses, we identified genes encodining putative lipoproteins across a wide range of archaeal species. While our in silico analyses, supported by in vivo data, indicate that most haloarchaeal lipoproteins are Tat substrates, these analyses also predict that many crenarchaeal species lack lipoproteins altogether and that other archaea, such as nonhalophilic euryarchaeal species, transport lipoproteins via the Sec pathway. To facilitate the identification of genes that encode potential haloarchaeal Tat-lipoproteins, we have developed TatLipo, a bioinformatic tool designed to detect lipoboxes in haloarchaeal Tat signal peptides. Our results provide a strong foundation for future studies aimed at identifying components of the archaeal lipoprotein biogenesis pathway.

  9. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  10. Multiobjective optimization in bioinformatics and computational biology.

    Science.gov (United States)

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.

  11. Microbial bioinformatics for food safety and production.

    Science.gov (United States)

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel; van Hijum, Sacha A F T

    2016-03-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput 'omics' technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety.

  12. Translational Bioinformatics:Past, Present, and Future

    Institute of Scientific and Technical Information of China (English)

    Jessica D. Tenenbaum

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contex-tualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field.

  13. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  14. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  15. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology.

  16. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive ''virulence detection chip'' by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  17. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  18. Bioinformatics for cancer immunology and immunotherapy.

    Science.gov (United States)

    Charoentong, Pornpimol; Angelova, Mihaela; Efremova, Mirjana; Gallasch, Ralf; Hackl, Hubert; Galon, Jerome; Trajanoski, Zlatko

    2012-11-01

    Recent mechanistic insights obtained from preclinical studies and the approval of the first immunotherapies has motivated increasing number of academic investigators and pharmaceutical/biotech companies to further elucidate the role of immunity in tumor pathogenesis and to reconsider the role of immunotherapy. Additionally, technological advances (e.g., next-generation sequencing) are providing unprecedented opportunities to draw a comprehensive picture of the tumor genomics landscape and ultimately enable individualized treatment. However, the increasing complexity of the generated data and the plethora of bioinformatics methods and tools pose considerable challenges to both tumor immunologists and clinical oncologists. In this review, we describe current concepts and future challenges for the management and analysis of data for cancer immunology and immunotherapy. We first highlight publicly available databases with specific focus on cancer immunology including databases for somatic mutations and epitope databases. We then give an overview of the bioinformatics methods for the analysis of next-generation sequencing data (whole-genome and exome sequencing), epitope prediction tools as well as methods for integrative data analysis and network modeling. Mathematical models are powerful tools that can predict and explain important patterns in the genetic and clinical progression of cancer. Therefore, a survey of mathematical models for tumor evolution and tumor-immune cell interaction is included. Finally, we discuss future challenges for individualized immunotherapy and suggest how a combined computational/experimental approaches can lead to new insights into the molecular mechanisms of cancer, improved diagnosis, and prognosis of the disease and pinpoint novel therapeutic targets.

  19. 西门塔尔牛肌内和皮下脂肪miRNA表达谱及miR-27b靶基因分析%Profiling of Differential Expressed MicroRNA in Intramuscular Fat and Subcutaneous Fat of Simmental and Bioinformatic Analyses of miR-27b Target Gene

    Institute of Scientific and Technical Information of China (English)

    汪海洋; 郑月; 李惠侠; 韩兆玉; 王根林

    2013-01-01

    [目的]通过分析西门塔尔牛肌内脂肪和皮下脂肪差异表达miRNA,及对重要的候选miRNA-27b进行靶基因预测及相关生物信息学分析,探索miRNA在动物肌内脂肪生长、发育过程中的作用及其调控机制。[方法]利用miRNA微阵列芯片技术对肌内脂肪和皮下脂肪miRNA表达谱进行检测和分析以筛选差异表达miRNA;qRT-PCR方法验证4个在肌内脂肪显著高表达的候选miRNAs;采用Targetscan生物信息学计算方法预测目标miR-27b(对肌内脂肪沉积可能起重要调节作用)的靶基因,并对靶基因进行GO注释描述、富集分析及KEGG富集。[结果]miRNA芯片分析发现肌内脂肪和皮下脂肪共有88个差异表达极显著miRNA(P<0.01)。候选的4个在肌内脂肪高表达miRNAs(miR-140、miR-145、miR-143、miR-27b)的qRT-PCR检测结果与芯片分析结果具有很好的一致性。KEGG富集显示,目标miR-27b的预测靶基因在MAPK、Wnt、Hedgehog、TGF-beta、GnRH等信号通路中显著富集,其中在MAPK信号通路中富集度最高。[结论]牛肌内脂肪和皮下脂肪组织中确实存在一些差异表达的miRNAs,某些重要的在差异高表达miRNA(如miR-27b)可能以独特的通路(如MAPK信号)来调节牛肌内脂肪的形成过程。%[Objective] In order to explore the role of microRNA in the growth and development process and its regulatory mechanism in intramuscular fat, the profiling of differential expressed microRNA in intramuscular fat and subcutaneous fat of Simmental were analyzed by microarray and real-time PCR (qRT-PCR). The target genes and signal pathway of miRNA-27b were predicted by bioinformatics software and database.[Method]Microarray of bovine microRNA was employed to detect the differentially expressed microRNA between intramuscular adipose tissue and subcutaneous adipose tissue. Then the expression of four selected microRNAs (miR-140, miR-145, miR-143 and miR-27b) in

  20. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Abstract Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'. A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption. An add-on module ('NuBio' facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures and functionality (e.g., to parse/write standard file formats. Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and

  1. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  2. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  3. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  4. The bioinformatics of next generation sequencing: a meeting report

    Institute of Scientific and Technical Information of China (English)

    Ravi Shankar

    2011-01-01

    @@ The Studio of Computational Biology & Bioinformatics (SCBB), IHBT, CSIR,Palampur, India organized one of the very first national workshop funded by DBT,Govt.of India, on the Bioinformatics issues associated with next generation sequencing approaches.The course structure was designed by SCBB, IHBT.The workshop took place in the IHBT premise on 17 and 18 June 2010.

  5. The 2015 Bioinformatics Open Source Conference (BOSC 2015.

    Directory of Open Access Journals (Sweden)

    Nomi L Harris

    2016-02-01

    Full Text Available The Bioinformatics Open Source Conference (BOSC is organized by the Open Bioinformatics Foundation (OBF, a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG before the annual Intelligent Systems in Molecular Biology (ISMB conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  6. 基于131I摄取功能的乳头状甲状腺癌肺转移患者血清差异miRNA的筛选及生物信息学分析%Screening and bioinformatic analyses of serum microRNAs in papillary thyroid carcinoma with lung metastases based on iodine-131 uptake capacity

    Institute of Scientific and Technical Information of China (English)

    薛艳丽; 邱忠领; 宋红俊; 罗全勇

    2013-01-01

    目的 筛选与乳头状甲状腺癌(PTC)肺转移灶131I摄取功能相关的血清miRNAs.方法 收集18例伴肺转移PTC患者血清标本,分为A、B两组,A组9例,为肺转移灶具有摄取131I功能者,B组9例,为肺转移灶无131I摄取功能者.采用miRNA芯片检测两组间的差异血清miRNAs,并经生物信息学分析筛选相关核心miRNA.结果 通过miRNA的检测,发现A、B两组患者血清间具有显著差异的miRNAs 13种,其中5种miRNAs在B组上调,8种miRNAs在B组下调.生物信息学分析发现血清miR-106a为核心miRNA,并参与调节193个靶基因,其中包括MAPK1、MAP3K8、MAP3K3、MAP3K12、MAP3K5、MAP3K14、MAP3K2、MAPK11等基因.结论 PTC肺转移灶具有131I摄取功能组患者血清与肺转移灶无131I摄取功能组患者血清miRNAs存在显著差异,在肺转移灶无131I摄取功能组患者血清中显著上调的miR-106a为核心miRNA,可能与肺转移灶丧失131I摄取功能有关.%Objective To compare the differences in serum miRNAs between papillary thyroid carcinoma (PTC) patients with 131I-avid lung metastases and those with non-131I-avid lung metastases,and to screen a serum miRNA relevant to 131I uptake in lung metastases from PTC.Methods We collected serum samples from 9 PTC patients with 131I-avid lung metastases(Group A)and 9 PTC patients with non-131I-avid lung metastases(Group B)respectively,and both groups were matched for age and sex,without history of other tumors.The serum miRNAs were profiled using miRNA chip technology and the difference between the two groups was compared.Bioinformatics analysis was performed to screen the core miRNAs.Results A total of 13 serum miRNAs with statistically significant differences were detected between Group A and Group B.Among them,5 miRNAs displayed high expression,and 8 miRNAs showed low expression in Group B.Bioinformatics analysis showed that serum miR-106a was the core miRNA.The miR-106a was involved in regulating 193 target genes

  7. Storage, data management, and retrieval in bioinformatics

    Science.gov (United States)

    Wong, Stephen T. C.; Patwardhan, Anil

    2001-12-01

    The evolution of biology into a large-scale quantitative molecular science has been paralleled by concomitant advances in computer storage systems, processing power, and data-analysis algorithms. The application of computer technologies to molecular biology data has given rise to a new system-based approach to biological research. Bioinformatics addresses problems related to the storage, retrieval and analysis of information about biological structure, sequence and function. Its goals include the development of integrated storage systems and analysis tools to interpret molecular biology data in a biologically meaningful manner in normal and disease processes and in efforts for drug discovery. This paper reviews recent developments in data management, storage, and retrieval that are central to the effective use of structural and functional genomics in fulfilling these goals.

  8. Bioinformatics analysis of estrogen-responsive genes

    Science.gov (United States)

    Handel, Adam E.

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter will discuss some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  9. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES 27, 28 February 1, 2, 3 March 2006 from 11:00 to 12:00 - Auditorium, bldg. 500 Decoding the Genome A special series of 5 lectures on: Recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  10. Wrapping and interoperating bioinformatics resources using CORBA.

    Science.gov (United States)

    Stevens, R; Miller, C

    2000-02-01

    Bioinformaticians seeking to provide services to working biologists are faced with the twin problems of distribution and diversity of resources. Bioinformatics databases are distributed around the world and exist in many kinds of storage forms, platforms and access paradigms. To provide adequate services to biologists, these distributed and diverse resources have to interoperate seamlessly within single applications. The Common Object Request Broker Architecture (CORBA) offers one technical solution to these problems. The key component of CORBA is its use of object orientation as an intermediate form to translate between different representations. This paper concentrates on an explanation of object orientation and how it can be used to overcome the problems of distribution and diversity by describing the interfaces between objects.

  11. Using Cluster Computers in Bioinformatics Research

    Institute of Scientific and Technical Information of China (English)

    周澄; 郁松年

    2003-01-01

    In the last ten years, high-performance and massively parallel computing technology comes into a high speed developing phase and is used in all fields. The cluster computer systems are also being widely used for their low cost and high performance. In bioinformatics research, solving a problem with computer usually takes hours even days. To speed up research, high-performance cluster computers are considered to be a good platform. Moving into the new MPP (massively parallel processing) system, the original algorithm should be parallelized in a proper way. In this paper, a new parallelizing method of useful sequence alignment algorithm (Smith-Waterman) is designed based on its optimizing algorithm already exists. The result is gratifying.

  12. Bioinformatics methods for identifying candidate disease genes

    Directory of Open Access Journals (Sweden)

    van Driel Marc A

    2006-06-01

    Full Text Available Abstract With the explosion in genomic and functional genomics information, methods for disease gene identification are rapidly evolving. Databases are now essential to the process of selecting candidate disease genes. Combining positional information with disease characteristics and functional information is the usual strategy by which candidate disease genes are selected. Enrichment for candidate disease genes, however, depends on the skills of the operating researcher. Over the past few years, a number of bioinformatics methods that enrich for the most likely candidate disease genes have been developed. Such in silico prioritisation methods may further improve by completion of datasets, by development of standardised ontologies across databases and species and, ultimately, by the integration of different strategies.

  13. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  14. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  15. Evaluating an Inquiry-based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-06-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and backgrounds of a diverse set of students, predominantly computer science and biology undergraduate and graduate students. Although the researchers desired to investigate student views of the course, they were interested in the potentially different perspectives. Q methodology, a measure of subjectivity, allowed the researchers to determine the various student perspectives in the bioinformatics course.

  16. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics.

  17. Thriving in multidisciplinary research: advice for new bioinformatics students.

    Science.gov (United States)

    Auerbach, Raymond K

    2012-09-01

    The sciences have seen a large increase in demand for students in bioinformatics and multidisciplinary fields in general. Many new educational programs have been created to satisfy this demand, but navigating these programs requires a non-traditional outlook and emphasizes working in teams of individuals with distinct yet complementary skill sets. Written from the perspective of a current bioinformatics student, this article seeks to offer advice to prospective and current students in bioinformatics regarding what to expect in their educational program, how multidisciplinary fields differ from more traditional paths, and decisions that they will face on the road to becoming successful, productive bioinformaticists.

  18. Evolution of web services in bioinformatics.

    Science.gov (United States)

    Neerincx, Pieter B T; Leunissen, Jack A M

    2005-06-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformaticians have experimented with several strategies to try to integrate data sets and tools. Owing to the lack of standards for data sets and the interfaces of the tools this is not a trivial task. Over the past few years building services with web-based interfaces has become a popular way of sharing the data and tools that have resulted from many bioinformatics projects. This paper discusses the interoperability problem and how web services are being used to try to solve it, resulting in the evolution of tools with web interfaces from HTML/web form-based tools not suited for automatic workflow generation to a dynamic network of XML-based web services that can easily be used to create pipelines.

  19. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    GAO; George; Fu

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  20. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    LIU Di; LIU Quan-He; WU Lin-Huan; LIU Bin; WU Jun; LAO Yi-Mei; LI Xiao-Jing; GAO George Fu; MA Jun-Cai

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioin-formatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  1. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Full Text Available Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.We designed and implemented the Genomics Virtual Laboratory (GVL as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  2. Delivering bioinformatics training: bridging the gaps between computer science and biomedicine.

    Science.gov (United States)

    Dubay, Christopher; Brundege, James M; Hersh, William; Spackman, Kent

    2002-01-01

    Biomedical researchers have always sought innovative methodologies to elucidate the underlying biology in their experimental models. As the pace of research has increased with new technologies that 'scale-up' these experiments, researchers have developed acute needs for the information technologies which assist them in managing and processing their experiments and results into useful data analyses that support scientific discovery. The application of information technology to support this discovery process is often called bioinformatics. We have observed a 'gap' in the training of those individuals who traditionally aid in the delivery of information technology at the level of the end-user (e.g. a systems analyst working with a biomedical researcher) which can negatively impact the successful application of technological solutions to biomedical research problems. In this paper we describe the roots and branches of bioinformatics to illustrate a range of applications and technologies that it encompasses. We then propose a taxonomy of bioinformatics as a framework for the identification of skills employed in the field. The taxonomy can be used to assess a set of skills required by a student to traverse this hierarchy from one area to another. We then describe a curriculum that attempts to deliver the identified skills to a broad audience of participants, and describe our experiences with the curriculum to show how it can help bridge the 'gap'.

  3. Scalable pattern recognition algorithms applications in computational biology and bioinformatics

    CERN Document Server

    Maji, Pradipta

    2014-01-01

    Reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics Includes numerous examples and experimental results to support the theoretical concepts described Concludes each chapter with directions for future research and a comprehensive bibliography

  4. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.;

    2004-01-01

    into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples....

  5. A high-throughput bioinformatics distributed computing platform

    OpenAIRE

    Keane, Thomas M; Page, Andrew J.; McInerney, James O; Naughton, Thomas J.

    2005-01-01

    In the past number of years the demand for high performance computing has greatly increased in the area of bioinformatics. The huge increase in size of many genomic databases has meant that many common tasks in bioinformatics are not possible to complete in a reasonable amount of time on a single processor. Recently distributed computing has emerged as an inexpensive alternative to dedicated parallel computing. We have developed a general-purpose distributed computing platform ...

  6. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT, to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs, thus greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work

  7. Biopipe: a flexible framework for protocol-based bioinformatics analysis.

    Science.gov (United States)

    Hoon, Shawn; Ratnapu, Kiran Kumar; Chia, Jer-Ming; Kumarasamy, Balamurugan; Juguang, Xiao; Clamp, Michele; Stabenau, Arne; Potter, Simon; Clarke, Laura; Stupka, Elia

    2003-08-01

    We identify several challenges facing bioinformatics analysis today. Firstly, to fulfill the promise of comparative studies, bioinformatics analysis will need to accommodate different sources of data residing in a federation of databases that, in turn, come in different formats and modes of accessibility. Secondly, the tsunami of data to be handled will require robust systems that enable bioinformatics analysis to be carried out in a parallel fashion. Thirdly, the ever-evolving state of bioinformatics presents new algorithms and paradigms in conducting analysis. This means that any bioinformatics framework must be flexible and generic enough to accommodate such changes. In addition, we identify the need for introducing an explicit protocol-based approach to bioinformatics analysis that will lend rigorousness to the analysis. This makes it easier for experimentation and replication of results by external parties. Biopipe is designed in an effort to meet these goals. It aims to allow researchers to focus on protocol design. At the same time, it is designed to work over a compute farm and thus provides high-throughput performance. A common exchange format that encapsulates the entire protocol in terms of the analysis modules, parameters, and data versions has been developed to provide a powerful way in which to distribute and reproduce results. This will enable researchers to discuss and interpret the data better as the once implicit assumptions are now explicitly defined within the Biopipe framework.

  8. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer;

    2015-01-01

    The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially...... within convenient, integrated “workbench” environments. Resource descriptions are the core element of registry and workbench systems, which are used to both help the user find and comprehend available software tools, data resources, and Web Services, and to localise, execute and combine them...

  9. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    Science.gov (United States)

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identify the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development.

  10. Integrating bioinformatics into senior high school: design principles and implications.

    Science.gov (United States)

    Machluf, Yossy; Yarden, Anat

    2013-09-01

    Bioinformatics is an integral part of modern life sciences. It has revolutionized and redefined how research is carried out and has had an enormous impact on biotechnology, medicine, agriculture and related areas. Yet, it is only rarely integrated into high school teaching and learning programs, playing almost no role in preparing the next generation of information-oriented citizens. Here, we describe the design principles of bioinformatics learning environments, including our own, that are aimed at introducing bioinformatics into senior high school curricula through engaging learners in scientifically authentic inquiry activities. We discuss the bioinformatics-related benefits and challenges that high school teachers and students face in the course of the implementation process, in light of previous studies and our own experience. Based on these lessons, we present a new approach for characterizing the questions embedded in bioinformatics teaching and learning units, based on three criteria: the type of domain-specific knowledge required to answer each question (declarative knowledge, procedural knowledge, strategic knowledge, situational knowledge), the scientific approach from which each question stems (biological, bioinformatics, a combination of the two) and the associated cognitive process dimension (remember, understand, apply, analyze, evaluate, create). We demonstrate the feasibility of this approach using a learning environment, which we developed for the high school level, and suggest some of its implications. This review sheds light on unique and critical characteristics related to broader integration of bioinformatics in secondary education, which are also relevant to the undergraduate level, and especially on curriculum design, development of suitable learning environments and teaching and learning processes.

  11. 拟南芥bZIP1转录因子通过与ABRE元件结合调节ABA信号传导%Arabidopisis bZIP1 Transcription Factor Binding to the ABRE Cis-Element Regulates Abscisic Acid Signal Transduction

    Institute of Scientific and Technical Information of China (English)

    孙晓丽; 李勇; 才华; 柏锡; 纪巍; 季佐军; 朱延明

    2011-01-01

    Abscisic acid (ABA) is a phytohormone and mediates the response and adaptation of higher plants to various environmental stresses during vegetative growth.The basic leucine zipper (bZIP) transcription factors are also important regulators of plant development and abiotic resistance, acting through either ABA-dependent or ABA-independent pathways.In this study, we investigated and characterized the involvement of the AtbZIP1 gene in plant responsiveness to ABA.As confirmed by PCR and RT-PCR, AtbZIP1 has been silenced in mutant Arabidopsis ko-1 (SALK_059343) and ko-2 (SALK_069489C).The AtbZIP1 knockout plants demonstrated reduced sensitivity to ABA both at the seed germination and seedling stage with improvements in rates of germination, leaf opening/greening, and primary root length.In order to investigate whether the regulation of AtbZIP1-mediated ABA responsiveness depended on the ABA-responsive elements (ABRE), we expressed the AtbZIP1 HIS6 fusion protein in E.coli and found that the AtbZIP1 HIS6 specifically bound to the ABRE cis-elements.Semi-quantitive RT PCR showed that AtbZIP1 disruption altered expressions of some ABA responsive genes, such as NCED3, RD22, KIN1, and RD29A.Our results indicated that AtbZIP1 regulates abscisic acid signal transduction by binding to the ABREs and altered the expressions of the ABA responsive genes.%ABA作为一种重要的植物激素和生长凋节剂,介导了高等植物在营养生长阶段对各种外界环境的响应和适应.bZIP类转录因子可以通过ABA依赖途径和ABA非依赖途径调节植物的生长发育和对非生物胁迫的耐性.本研究通过AtbZIP1 T-DNA插入突变的拟南芥植株ko-1(SALK_059343)和ko-2(SALK_069489C)在ABA处理后的表型实验,验证了AtbZIP1参与ABA依赖的信号传导通路.采用"三引物法",分别在DNA水平和RNA水平通过PCR和RT-PCR验证了AtbZIP1基因在拟南芥突变体中的沉默效果.定量分析数据表明,在种子萌发阶段,经过0.6μmolL-1

  12. Kvalitative analyser ..

    DEFF Research Database (Denmark)

    Boolsen, Merete Watt

    bogen forklarer de fundamentale trin i forskningsprocessen og applikerer dem på udvalgte kvalitative analyser: indholdsanalyse, Grounded Theory, argumentationsanalyse og diskursanalyse......bogen forklarer de fundamentale trin i forskningsprocessen og applikerer dem på udvalgte kvalitative analyser: indholdsanalyse, Grounded Theory, argumentationsanalyse og diskursanalyse...

  13. Application of Bioinformatics and Systems Biology in Medicinal Plant Studies

    Institute of Scientific and Technical Information of China (English)

    DENG You-ping; AI Jun-mei; XIAO Pei-gen

    2010-01-01

    One important purpose to investigate medicinal plants is to understand genes and enzymes that govern the biological metabolic process to produce bioactive compounds.Genome wide high throughput technologies such as genomics,transcriptomics,proteomics and metabolomics can help reach that goal.Such technologies can produce a vast amount of data which desperately need bioinformatics and systems biology to process,manage,distribute and understand these data.By dealing with the"omics"data,bioinformatics and systems biology can also help improve the quality of traditional medicinal materials,develop new approaches for the classification and authentication of medicinal plants,identify new active compounds,and cultivate medicinal plant species that tolerate harsh environmental conditions.In this review,the application of bioinformatics and systems biology in medicinal plants is briefly introduced.

  14. PineappleDB: An online pineapple bioinformatics resource

    Directory of Open Access Journals (Sweden)

    Fairbairn David J

    2005-10-01

    Full Text Available Abstract Background A world first pineapple EST sequencing program has been undertaken to investigate genes expressed during non-climacteric fruit ripening and the nematode-plant interaction during root infection. Very little is known of how non-climacteric fruit ripening is controlled or of the molecular basis of the nematode-plant interaction. PineappleDB was developed to provide the research community with access to a curated bioinformatics resource housing the fruit, root and nematode infected gall expressed sequences. Description PineappleDB is an online, curated database providing integrated access to annotated expressed sequence tag (EST data for cDNA clones isolated from pineapple fruit, root, and nematode infected root gall vascular cylinder tissues. The database currently houses over 5600 EST sequences, 3383 contig consensus sequences, and associated bioinformatic data including splice variants, Arabidopsis homologues, both MIPS based and Gene Ontology functional classifications, and clone distributions. The online resource can be searched by text or by BLAST sequence homology. The data outputs provide comprehensive sequence, bioinformatic and functional classification information. Conclusion The online pineapple bioinformatic resource provides the research community with access to pineapple fruit and root/gall sequence and bioinformatic data in a user-friendly format. The search tools enable efficient data mining and present a wide spectrum of bioinformatic and functional classification information. PineappleDB will be of broad appeal to researchers investigating pineapple genetics, non-climacteric fruit ripening, root-knot nematode infection, crassulacean acid metabolism and alternative RNA splicing in plants.

  15. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform

    Science.gov (United States)

    Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J.; Davenport, Karen W.; Bishop-Lilly, Kimberly A.; Xu, Yan; Ahmed, Sanaa; Feng, Shihai; Mokashi, Vishwesh P.; Chain, Patrick S.G.

    2017-01-01

    Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research. PMID:27899609

  16. Approaches in integrative bioinformatics towards the virtual cell

    CERN Document Server

    Chen, Ming

    2014-01-01

    Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrat

  17. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems; ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques, to solve a variety of biological problems. One of the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics based problems.

  18. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    Science.gov (United States)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  19. High-performance computational solutions in protein bioinformatics

    CERN Document Server

    Mrozek, Dariusz

    2014-01-01

    Recent developments in computer science enable algorithms previously perceived as too time-consuming to now be efficiently used for applications in bioinformatics and life sciences. This work focuses on proteins and their structures, protein structure similarity searching at main representation levels and various techniques that can be used to accelerate similarity searches. Divided into four parts, the first part provides a formal model of 3D protein structures for functional genomics, comparative bioinformatics and molecular modeling. The second part focuses on the use of multithreading for

  20. Phosphoproteomics and Bioinformatics Analyses of Spinal Cord Proteins in Rats with Morphine Tolerance

    OpenAIRE

    Wen-Jinn Liaw; Cheng-Ming Tsao; Go-Shine Huang; Chin-Chen Wu; Shung-Tai Ho; Jhi-Joung Wang; Yuan-Xiang Tao; Hao-Ai Shui

    2014-01-01

    INTRODUCTION: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. METHODS: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal c...

  1. Incorporating bioinformatics into biological science education in Nigeria: prospects and challenges.

    Science.gov (United States)

    Ojo, O O; Omabe, M

    2011-06-01

    The urgency to process and analyze the deluge of data created by proteomics and genomics studies worldwide has caused bioinformatics to gain prominence and importance. However, its multidisciplinary nature has created a unique demand for specialist trained in both biology and computing. Several countries, in response to this challenge, have developed a number of manpower training programmes. This review presents a description of the meaning, scope, history and development of bioinformatics with focus on prospects and challenges facing bioinformatics education worldwide. The paper also provides an overview of attempts at the introduction of bioinformatics in Nigeria; describes the existing bioinformatics scenario in Nigeria and suggests strategies for effective bioinformatics education in Nigeria.

  2. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  3. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial.

    Science.gov (United States)

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud'homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address the issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to identify reliably genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application.

  4. Robust enzyme design: bioinformatic tools for improved protein stability.

    Science.gov (United States)

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  5. An evaluation of ontology exchange languages for bioinformatics.

    Science.gov (United States)

    McEntire, R; Karp, P; Abernethy, N; Benton, D; Helt, G; DeJongh, M; Kent, R; Kosky, A; Lewis, S; Hodnett, D; Neumann, E; Olken, F; Pathak, D; Tarczy-Hornoch, P; Toldo, L; Topaloglou, T

    2000-01-01

    Ontologies are specifications of the concepts in a given field, and of the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they satisfied each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusion of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua.

  6. A Tool for Creating and Parallelizing Bioinformatics Pipelines

    Science.gov (United States)

    2007-06-01

    well as that are incorporated into InterPro (Mulder, et al., 2005). other users’ work. PUMA2 ( Maltsev , et al., 2006) incorporates more than 20 0-7695...pipeline for protocol-based bioinformatics analysis." Genome Res., 13(8), pp. 1904-1915, 2003. Maltsev , N. and E. Glass, et al., "PUMA2--grid-based 4

  7. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA LibrariesG. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  8. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    Science.gov (United States)

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-04

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries.

  9. Bioinformatics: Tools to accelerate population science and disease control research.

    Science.gov (United States)

    Forman, Michele R; Greene, Sarah M; Avis, Nancy E; Taplin, Stephen H; Courtney, Paul; Schad, Peter A; Hesse, Bradford W; Winn, Deborah M

    2010-06-01

    Population science and disease control researchers can benefit from a more proactive approach to applying bioinformatics tools for clinical and public health research. Bioinformatics utilizes principles of information sciences and technologies to transform vast, diverse, and complex life sciences data into a more coherent format for wider application. Bioinformatics provides the means to collect and process data, enhance data standardization and harmonization for scientific discovery, and merge disparate data sources. Achieving interoperability (i.e. the development of an informatics system that provides access to and use of data from different systems) will facilitate scientific explorations and careers and opportunities for interventions in population health. The National Cancer Institute's (NCI's) interoperable Cancer Biomedical Informatics Grid (caBIG) is one of a number of illustrative tools in this report that are being mined by population scientists. Tools are not all that is needed for progress. Challenges persist, including a lack of common data standards, proprietary barriers to data access, and difficulties pooling data from studies. Population scientists and informaticists are developing promising and innovative solutions to these barriers. The purpose of this paper is to describe how the application of bioinformatics systems can accelerate population health research across the continuum from prevention to detection, diagnosis, treatment, and outcome.

  10. BioRuby: Bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, N.; Prins, J.C.P.; Nakao, M.; Bonnal, R.; Aerts, J.; Katayama, A.

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  11. BioRuby : bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  12. CROSSWORK for Glycans: Glycan Identificatin Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope " - CROSSWORKS for glycans:  a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  13. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  14. Hidden in the Middle: Culture, Value and Reward in Bioinformatics

    Science.gov (United States)

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics--the so-called shotgun marriage between biology and computer science--is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised…

  15. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  16. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Science.gov (United States)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  17. Anticipating Viral Species Jumps: Bioinformatics and Data Needs

    Science.gov (United States)

    2011-06-01

    with a function of propelling or steering the evolution of a gene, phenotypic trait or species (Prakash 2008). Bioinformatics Research, development...platforms like SJOne. Ebola Viruses There are five known species of Ebola virus: Bundibugyo, Cote d’Ivoire, Reston, Sudan and Zaire. The relative

  18. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of human genome project is near the end, the work has begun for discovering novel genes from genome sequences and annotating their biological functions. Here are reviewed current major bioinformatics tools and technologies available for large scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  19. WIWS: a protein structure bioinformatics Web service collection.

    NARCIS (Netherlands)

    Hekkelman, M.L.; Beek, T.A.H. te; Pettifer, S.R.; Thorne, D.; Attwood, T.K.; Vriend, G.

    2010-01-01

    The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of p

  20. A Bioinformatic Approach to Inter Functional Interactions within Protein Sequences

    Science.gov (United States)

    2009-02-23

    Geoffrey Webb Prof James Whisstock Dr Jianging Song Mr Khalid Mahmood Mr Cyril Reboul Ms Wan Ting Kan Publications: List peer-reviewed...Khalid Mahmood, Jianging Song, Cyril Reboul , Wan Ting Kan, Geoffrey I. Webb and James C. Whisstock. To be submitted to BMC Bioinformatics. Outline

  1. Virginia Bioinformatics Institute offers fellowships for graduate work in transdisciplinary science

    OpenAIRE

    Bland, Susan

    2008-01-01

    The Virginia Bioinformatics Institute at Virginia Tech, in collaboration with Virginia Tech's Ph.D. program in genetics, bioinformatics, and computational biology, is providing substantial fellowships in support of graduate work in transdisciplinary team science.

  2. Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges.

    Science.gov (United States)

    Bruder, Katherine; Malki, Kema; Cooper, Alexandria; Sible, Emily; Shapiro, Jason W; Watkins, Siobhan C; Putonti, Catherine

    2016-01-01

    Advances in bioinformatics and sequencing technologies have allowed for the analysis of complex microbial communities at an unprecedented rate. While much focus is often placed on the cellular members of these communities, viruses play a pivotal role, particularly bacteria-infecting viruses (bacteriophages); phages mediate global biogeochemical processes and drive microbial evolution through bacterial grazing and horizontal gene transfer. Despite their importance and ubiquity in nature, very little is known about the diversity and structure of viral communities. Though the need for culture-based methods for viral identification has been somewhat circumvented through metagenomic techniques, the analysis of metaviromic data is marred with many unique issues. In this review, we examine the current bioinformatic approaches for metavirome analyses and the inherent challenges facing the field as illustrated by the ongoing efforts in the exploration of freshwater phage populations.

  3. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  4. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  5. Bioinformatics training: selecting an appropriate learning content management system--an example from the European Bioinformatics Institute.

    Science.gov (United States)

    Wright, Victoria Ann; Vaughan, Brendan W; Laurent, Thomas; Lopez, Rodrigo; Brooksbank, Cath; Schneider, Maria Victoria

    2010-11-01

    Today's molecular life scientists are well educated in the emerging experimental tools of their trade, but when it comes to training on the myriad of resources and tools for dealing with biological data, a less ideal situation emerges. Often bioinformatics users receive no formal training on how to make the most of the bioinformatics resources and tools available in the public domain. The European Bioinformatics Institute, which is part of the European Molecular Biology Laboratory (EMBL-EBI), holds the world's most comprehensive collection of molecular data, and training the research community to exploit this information is embedded in the EBI's mission. We have evaluated eLearning, in parallel with face-to-face courses, as a means of training users of our data resources and tools. We anticipate that eLearning will become an increasingly important vehicle for delivering training to our growing user base, so we have undertaken an extensive review of Learning Content Management Systems (LCMSs). Here, we describe the process that we used, which considered the requirements of trainees, trainers and systems administrators, as well as taking into account our organizational values and needs. This review describes the literature survey, user discussions and scripted platform testing that we performed to narrow down our choice of platform from 36 to a single platform. We hope that it will serve as guidance for others who are seeking to incorporate eLearning into their bioinformatics training programmes.

  6. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  7. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  8. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  9. Genome bioinformatic analysis of nonsynonymous SNPs

    Directory of Open Access Journals (Sweden)

    Todd John A

    2007-08-01

    Full Text Available Abstract Background Genome-wide association studies of common diseases for common, low penetrance causal variants are underway. A proportion of these will alter protein sequences, the most common of which is the non-synonymous single nucleotide polymorphism (nsSNP. It would be an advantage if the functional effects of an nsSNP on protein structure and function could be predicted, both for the final identification process of a causal variant in a disease-associated chromosome region, and in further functional analyses of the nsSNP and its disease-associated protein. Results In the present report we have compared and contrasted structure- and sequence-based methods of prediction to over 5500 genes carrying nearly 24,000 nsSNPs, by employing an automatic comparative modelling procedure to build models for the genes. The nsSNP information came from two sources, the OMIM database which are rare (minor allele frequency, MAF, 0.05, have no known link to a disease. For over 40% of the nsSNPs, structure-based methods predicted which of these sequence changes are likely to either disrupt the structure of the protein or interfere with the function or interactions of the protein. For the remaining 60%, we generated sequence-based predictions. Conclusion We show that, in general, the prediction tools are able distinguish disease causing mutations from those mutations which are thought to have a neutral affect. We give examples of mutations in genes that are predicted to be deleterious and may have a role in disease. Contrary to previous reports, we also show that rare mutations are consistently predicted to be deleterious as often as commonly occurring nsSNPs.

  10. Identification and bioinformatic characterization of a multidrug resistance associated protein (ABCC) gene in Plasmodium berghei

    Science.gov (United States)

    González-Pons, María; Szeto, Ada C; González-Méndez, Ricardo; Serrano, Adelfa E

    2009-01-01

    Background The ATP-binding cassette (ABC) superfamily is one of the largest evolutionarily conserved families of proteins. ABC proteins play key roles in cellular detoxification of endobiotics and xenobiotics. Overexpression of certain ABC proteins, among them the multidrug resistance associated protein (MRP), contributes to drug resistance in organisms ranging from human neoplastic cells to parasitic protozoa. In the present study, the Plasmodium berghei mrp gene (pbmrp) was partially characterized and the predicted protein was classified using bioinformatics in order to explore its putative involvement in drug resistance. Methods The pbmrp gene from the P. berghei drug sensitive, N clone, was sequenced using a PCR strategy. Classification and domain organization of pbMRP were determined with bioinformatics. The Plasmodium spp. MRPs were aligned and analysed to study their conserved motifs and organization. Gene copy number and organization were determined via Southern blot analysis in both N clone and the chloroquine selected line, RC. Chromosomal Southern blots and RNase protection assays were employed to determine the chromosomal location and expression levels of pbmrp in blood stages. Results The pbmrp gene is a single copy, intronless gene with a predicted open reading frame spanning 5820 nucleotides. Bioinformatic analyses show that this protein has distinctive features characteristic of the ABCC sub-family. Multiple sequence alignments reveal a high degree of conservation in the nucleotide binding and transmembrane domains within the MRPs from the Plasmodium spp. analysed. Expression of pbmrp was detected in asexual blood stages. Gene organization, copy number and mRNA expression was similar in both lines studied. A chromosomal translocation was observed in the chloroquine selected RC line, from chromosome 13/14 to chromosome 8, when compared to the drug sensitive N clone. Conclusion In this study, the pbmrp gene was sequenced and classified as a member of

  11. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  12. Bioinformatics Analysis of Zinc Transporter from Baoding Alfalfa

    Institute of Scientific and Technical Information of China (English)

    Haibo WANG; Junyun GUO

    2012-01-01

    [Objective] This study aimed to perform the bioinformatics analysis of Zinc transporter (ZnT) from Baoding Alfalfa. [Method] Based on the amino acid sequence, the physical and chemical properties, hydrophilicity/hydrophobicity, secondary structure of ZnT from Baoding alfalfa were predicted by a series of bioinformatics software. And the transmembrane domains were predicted by using different online tools. [Result] ZnT is a hydrophobic protein containing 408 amino acids with the theoretical pl of 5.94, and it has 7 potential transmembrane hydrophobic regions. In the sec- ondary structure, co-helix (Hh) accounted for 48.04%, extended strand (Ee) for 9.56%, random coil (Cc) for 42.40%, which was accored with the characteristic of transmembrane protein. [Conclusion] mZnT is a member of CDF family, responsible for transporting Zn^2+ out of the cell membrane to reduce the concentration and toxicity of Zn^2+.

  13. Bioinformatics Data Distribution and Integration via Web Services and XML

    Institute of Scientific and Technical Information of China (English)

    Xiao Li; Yizheng Zhang

    2003-01-01

    It is widely recognized that exchange, distribution, and integration of biological data are the keys to improve bioinformatics and genome biology in post-genomic era. However, the problem of exchanging and integrating biological data is not solved satisfactorily. The eXtensible Markup Language (XML) is rapidly spreading as an emerging standard for structuring documents to exchange and integrate data on the World Wide Web (WWW). Web service is the next generation of WWW and is founded upon the open standards of W3C (World Wide Web Consortium)and IETF (Internet Engineering Task Force). This paper presents XML and Web Services technologies and their use for an appropriate solution to the problem of bioinformatics data exchange and integration.

  14. Some statistics in bioinformatics: the fifth Armitage Lecture.

    Science.gov (United States)

    Solomon, Patricia J

    2009-10-15

    The spirit and content of the 2007 Armitage Lecture are presented in this paper. To begin, two areas of Peter Armitage's early work are distinguished: his pioneering research on sequential methods intended for use in medical trials and the comparison of survival curves. Their influence on much later work is highlighted, and motivate the proposal of several statistical 'truths' that are presented in the paper. The illustration of these truths demonstrates biology's new morphology and its dominance over statistics in this century. An overview of a recent proteomics ovarian cancer study is given as a warning of what can happen when bioinformatics meets epidemiology badly, in particular, when the study design is poor. A statistical bioinformatics success story is outlined, in which gene profiling is helping to identify novel genes and networks involved in mouse embryonic stem cell development. Some concluding thoughts are given.

  15. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.  .

  16. State of the nation in data integration for bioinformatics.

    Science.gov (United States)

    Goble, Carole; Stevens, Robert

    2008-10-01

    Data integration is a perennial issue in bioinformatics, with many systems being developed and many technologies offered as a panacea for its resolution. The fact that it is still a problem indicates a persistence of underlying issues. Progress has been made, but we should ask "what lessons have been learnt?", and "what still needs to be done?" Semantic Web and Web 2.0 technologies are the latest to find traction within bioinformatics data integration. Now we can ask whether the Semantic Web, mashups, or their combination, have the potential to help. This paper is based on the opening invited talk by Carole Goble given at the Health Care and Life Sciences Data Integration for the Semantic Web Workshop collocated with WWW2007. The paper expands on that talk. We attempt to place some perspective on past efforts, highlight the reasons for success and failure, and indicate some pointers to the future.

  17. Rise and demise of bioinformatics? Promise and progress.

    Directory of Open Access Journals (Sweden)

    Christos A Ouzounis

    Full Text Available The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

  18. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  19. WIWS: a protein structure bioinformatics Web service collection.

    Science.gov (United States)

    Hekkelman, M L; Te Beek, T A H; Pettifer, S R; Thorne, D; Attwood, T K; Vriend, G

    2010-07-01

    The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here a collection of WHAT IF-based protein structure bioinformatics web services: these relate to structure quality, the use of symmetry in crystal structures, structure correction and optimization, adding hydrogens and optimizing hydrogen bonds and a series of geometric calculations. The freely accessible web services are based on the industry standard WS-I profile and the EMBRACE technical guidelines, and are available via both REST and SOAP paradigms. The web services run on a dedicated computational cluster; their function and availability is monitored daily.

  20. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  1. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  2. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Full Text Available Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI, and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  3. Bioinformatic prediction and functional characterization of human KIAA0100 gene

    OpenAIRE

    He Cui; Xi Lan; Shemin Lu; Fujun Zhang; Wanggang Zhang

    2017-01-01

    Our previous study demonstrated that human KIAA0100 gene was a novel acute monocytic leukemia-associated antigen (MLAA) gene. But the functional characterization of human KIAA0100 gene has remained unknown to date. Here, firstly, bioinformatic prediction of human KIAA0100 gene was carried out using online softwares; Secondly, Human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system in U937 cells...

  4. The web server of IBM's Bioinformatics and Pattern Discovery group

    OpenAIRE

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel,; Shibuya, Tetsuo

    2003-01-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic ...

  5. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillussphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis toward genome sequence of Lysinibacillussphaericus (L. sphaericus) to determine gene encoded for gelatinase. L. sphaericus was isolated from soil and gelatinase species-specific bacterium to porcine and bovine gelatin. This bacterium offers the possibility of enzymes production which is specific to both species of meat, respectively. The main focus of this research is to identify the gelatinase encoded gene within the bacteria of L. Sphaericus using bioinformatics analysis of partially sequence genome. From the research study, three candidate gene were identified which was, gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21 which containing 1563 base pair (bp) in size with 520 amino acids sequence; Secondly, gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17 which containing 1776 bp in size with 591 amino acids sequence; and Thirdly, gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8 containing 1701 bp in size with 566 amino acids sequence. Three pairs of oligonucleotide primers were designed and namely as, F1, R1, F2, R2, F3 and R3 were targeted short sequences of cDNA by PCR. The amplicons were reliably results in 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Therefore, the results of bioinformatics analysis of L. Sphaericus resulting in gene encoded gelatinase were identified.

  6. An Integrative Study on Bioinformatics Computing Concepts, Issues and Problems

    Directory of Open Access Journals (Sweden)

    Muhammad Zakarya

    2011-11-01

    Full Text Available Bioinformatics is the permutation and mishmash of biological science and 4IT. The discipline covers every computational tools and techniques used to administer, examine and manipulate huge sets of biological statistics. The discipline also helps in creation of databases to store up and supervise biological statistics, improvement of computer algorithms to find out relations in these databases and use of computer tools for the study and understanding of biological information, including DNA, RNA, protein sequences, gene expression profiles, protein structures, and biochemical pathways. The study of this paper implements an integrative solution. As we know that solution to a problem in a specific discipline may be a solution to another problem in a different discipline. For example entropy that has been rented from physical sciences is solution to most of the problems and issues in computer science. Another example is bioinformatics, where computing method and applications are implemented over biological information. This paper shows an initiative step towards that and will discuss upon the needs for integration of multiple discipline and sciences. Similarly green chemistry gives birth to a new kind of computing i.e. green computing. In next versions of this paper we will study biological fuel cell and will discuss to develop a mobile battery that will be life time charged using the concepts of biological fuel cell. Another issue that we are going to discuss in our series is brain tumor detection. This paper is a review on BI i.e. bioinformatics to start with.

  7. Microsatellites for next-generation ecologists: a post-sequencing bioinformatics pipeline.

    Science.gov (United States)

    Fernandez-Silva, Iria; Whitney, Jonathan; Wainwright, Benjamin; Andrews, Kimberly R; Ylitalo-Ward, Heather; Bowen, Brian W; Toonen, Robert J; Goetze, Erica; Karl, Stephen A

    2013-01-01

    Microsatellites are the markers of choice for a variety of population genetic studies. The recent advent of next-generation pyrosequencing has drastically accelerated microsatellite locus discovery by providing a greater amount of DNA sequencing reads at lower costs compared to other techniques. However, laboratory testing of PCR primers targeting potential microsatellite markers remains time consuming and costly. Here we show how to reduce this workload by screening microsatellite loci via bioinformatic analyses prior to primer design. Our method emphasizes the importance of sequence quality, and we avoid loci associated with repetitive elements by screening with repetitive sequence databases available for a growing number of taxa. Testing with the Yellowstripe Goatfish Mulloidichthys flavolineatus and the marine planktonic copepod Pleuromamma xiphias we show higher success rate of primers selected by our pipeline in comparison to previous in silico microsatellite detection methodologies. Following the same pipeline, we discover and select microsatellite loci in nine additional species including fishes, sea stars, copepods and octopuses.

  8. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of millions of nucleic acids sequences, which are currently deposited in international databases (public or private); these databases contain information of genes, RNA, ORF, proteins, intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs; which were renewed in the use of new mathematical methods, and the introduction of the use of artificial intelligence. In addition to the constant creation of supercomputing units trained to withstand the heavy workload of sequence analysis. However, it is still necessary the innovation on platforms that allow genomic analyses, faster and more effectively, with a technological understanding of all biological processes.

  9. 10th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Mayo, Francisco; Paz, Juan

    2016-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  10. 8th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Santana, Juan

    2014-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  11. Bioinformatics analysis of the gene expression profile of hepatocellular carcinoma: preliminary results

    Science.gov (United States)

    Li, Jia

    2016-01-01

    Aim of the study To analyse the expression profile of hepatocellular carcinoma compared with normal liver by using bioinformatics methods. Material and methods In this study, we analysed the microarray expression data of HCC and adjacent normal liver samples from the Gene Expression Omnibus (GEO) database to screen for differentially expressed genes. Then, functional analyses were performed using GenCLiP analysis, Gene Ontology categories, and aberrant pathway identification. In addition, we used the CMap database to identify small molecules that can induce HCC. Results Overall, 2721 differentially expressed genes (DEGs) were identified. We found 180 metastasis-related genes and constructed co-occurrence networks. Several significant pathways, including the transforming growth factor β (TGF-β) signalling pathway, were identified as closely related to these DEGs. Some candidate small molecules (such as betahistine) were identified that might provide a basis for developing HCC treatments in the future. Conclusions Although we functionally analysed the differences in the gene expression profiles of HCC and normal liver tissues, our study is essentially preliminary, and it may be premature to apply our results to clinical trials. Further research and experimental testing are required in future studies. PMID:27095935

  12. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology.High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demands of developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of American on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics. The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  13. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    Full Text Available Abstract The revolutionary growth in the computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes including a cleaner draft of human genome have been sequenced raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy to administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii cell visualization and simulation to study and model cell behavior, and (iv application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1 analysis based upon the available experimental wet-lab data, (2 the use of mathematical modeling to derive new information, and (3 an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene

  14. The bioinformatics of microarrays to study cancer: Advantages and disadvantages

    Science.gov (United States)

    Rodríguez-Segura, M. A.; Godina-Nava, J. J.; Villa-Treviño, S.

    2012-10-01

    Microarrays are devices designed to analyze simultaneous expression of thousands of genes. However, the process will adds noise into the information at each stage of the study. To analyze these thousands of data is necessary to use bioinformatics tools. The traditional analysis begins by normalizing data, but the obtained results are highly dependent on how it is conducted the study. It is shown the need to develop new strategies to analyze microarray. Liver tissue taken from an animal model in which is chemically induced cancer is used as an example.

  15. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    Science.gov (United States)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics based algorithm, for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance---and determination---of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely

  16. Bioinformatics pipeline for functional identification and characterization of proteins

    Science.gov (United States)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; PlÄ der, Wojciech; Przybecki, Zbigniew

    2015-09-01

    The new sequencing methods, called Next Generation Sequencing gives an opportunity to possess a vast amount of data in short time. This data requires structural and functional annotation. Functional identification and characterization of predicted proteins could be done by in silico approches, thanks to a numerous computational tools available nowadays. However, there is a need to confirm the results of proteins function prediction using different programs and comparing the results or confirm experimentally. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  17. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe

    2016-01-01

    -dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (i......This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes...

  18. Identification of plasma lipid biomarkers for prostate cancer by lipidomics and bioinformatics.

    Directory of Open Access Journals (Sweden)

    Xinchun Zhou

    Full Text Available BACKGROUND: Lipids have critical functions in cellular energy storage, structure and signaling. Many individual lipid molecules have been associated with the evolution of prostate cancer; however, none of them has been approved to be used as a biomarker. The aim of this study is to identify lipid molecules from hundreds plasma apparent lipid species as biomarkers for diagnosis of prostate cancer. METHODOLOGY/PRINCIPAL FINDINGS: Using lipidomics, lipid profiling of 390 individual apparent lipid species was performed on 141 plasma samples from 105 patients with prostate cancer and 36 male controls. High throughput data generated from lipidomics were analyzed using bioinformatic and statistical methods. From 390 apparent lipid species, 35 species were demonstrated to have potential in differentiation of prostate cancer. Within the 35 species, 12 were identified as individual plasma lipid biomarkers for diagnosis of prostate cancer with a sensitivity above 80%, specificity above 50% and accuracy above 80%. Using top 15 of 35 potential biomarkers together increased predictive power dramatically in diagnosis of prostate cancer with a sensitivity of 93.6%, specificity of 90.1% and accuracy of 97.3%. Principal component analysis (PCA and hierarchical clustering analysis (HCA demonstrated that patient and control populations were visually separated by identified lipid biomarkers. RandomForest and 10-fold cross validation analyses demonstrated that the identified lipid biomarkers were able to predict unknown populations accurately, and this was not influenced by patient's age and race. Three out of 13 lipid classes, phosphatidylethanolamine (PE, ether-linked phosphatidylethanolamine (ePE and ether-linked phosphatidylcholine (ePC could be considered as biomarkers in diagnosis of prostate cancer. CONCLUSIONS/SIGNIFICANCE: Using lipidomics and bioinformatic and statistical methods, we have identified a few out of hundreds plasma apparent lipid molecular

  19. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    Science.gov (United States)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinfonnatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degrees case-studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  20. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  1. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    Science.gov (United States)

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  2. Bioinformatics analysis of metastasis-related proteins in hepatocellular carcinoma

    Institute of Scientific and Technical Information of China (English)

    Pei-Ming Song; Yang Zhang; Yu-Fei He; Hui-Min Bao; Jian-Hua Luo; Yin-Kun Liu; Peng-Yuan Yang; Xian Chen

    2008-01-01

    AIM: To analyze the metastasis-related proteins in hepatocellular carcinoma (HCC) and discover the biomark-er candidates for diagnosis and therapeutic intervention of HCC metastasis with bioinformatics tools.METHODS: Metastasis-related proteins were determined by stable isotope labeling and MS analysis and analyzed with bioinformatics resources, including Phobius, Kyoto encyclopedia of genes and genomes (KEGG), online mendelian inheritance in man (OHIH) and human protein reference database (HPRD).RESULTS: All the metastasis-related proteins were linked to 83 pathways in KEGG, including MAPK and p53 signal pathways. Protein-protein interaction network showed that all the metastasis-related proteins were categorized into 19 function groups, including cell cycle, apoptosis and signal transcluction. OMIM analysis linked these proteins to 186 OMIM entries.CONCLUSION: Metastasis-related proteins provide HCC cells with biological advantages in cell proliferation, migration and angiogenesis, and facilitate metastasis of HCC cells. The bird's eye view can reveal a global charac-teristic of metastasis-related proteins and many differen-tially expressed proteins can be identified as candidates for diagnosis and treatment of HCC.

  3. MOWServ: a web client for integration of bioinformatic resources.

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J; Claros, M Gonzalo; Trelles, Oswaldo

    2010-07-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user's tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/.

  4. MAPI: towards the integrated exploitation of bioinformatics Web Services

    Directory of Open Access Journals (Sweden)

    Karlsson Johan

    2011-10-01

    Full Text Available Abstract Background Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. Results To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. Conclusions The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others.

  5. Web services at the European Bioinformatics Institute-2009.

    Science.gov (United States)

    McWilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-07-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and are ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; require no prior knowledge of the technology and no more than basic experience in programming. In the following we wish to inform of new and updated services as well as briefly describe planned developments to be made available during the course of 2009-2010.

  6. The MPI Bioinformatics Toolkit for protein sequence analysis.

    Science.gov (United States)

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N

    2006-07-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de.

  7. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    Science.gov (United States)

    Bokulich, Nicholas A; Rideout, Jai Ram; Mercurio, William G; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F; Dutton, Rachel J; Turnbaugh, Peter J; Knight, Rob; Caporaso, J Gregory

    2016-01-01

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

  8. Bioinformatic prediction and functional characterization of human KIAA0100 gene

    Directory of Open Access Journals (Sweden)

    He Cui

    2017-02-01

    Full Text Available Our previous study demonstrated that human KIAA0100 gene was a novel acute monocytic leukemia-associated antigen (MLAA gene. But the functional characterization of human KIAA0100 gene has remained unknown to date. Here, firstly, bioinformatic prediction of human KIAA0100 gene was carried out using online softwares; Secondly, Human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR/CRISPR-associated (Cas 9 system in U937 cells. Cell proliferation and apoptosis were next evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that human KIAA0100 gene was located on 17q11.2, and human KIAA0100 protein was located in the secretory pathway. Besides, human KIAA0100 protein contained a signalpeptide, a transmembrane region, three types of secondary structures (alpha helix, extended strand, and random coil , and four domains from mitochondrial protein 27 (FMP27. The observation on functional characterization of human KIAA0100 gene revealed that its downregulation inhibited cell proliferation, and promoted cell apoptosis in U937 cells. To summarize, these results suggest human KIAA0100 gene possibly comes within mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.

  9. Protecting innovation in bioinformatics and in-silico biology.

    Science.gov (United States)

    Harrison, Robert

    2003-01-01

    Commercial success or failure of innovation in bioinformatics and in-silico biology requires the appropriate use of legal tools for protecting and exploiting intellectual property. These tools include patents, copyrights, trademarks, design rights, and limiting information in the form of 'trade secrets'. Potentially patentable components of bioinformatics programmes include lines of code, algorithms, data content, data structure and user interfaces. In both the US and the European Union, copyright protection is granted for software as a literary work, and most other major industrial countries have adopted similar rules. Nonetheless, the grant of software patents remains controversial and is being challenged in some countries. Current debate extends to aspects such as whether patents can claim not only the apparatus and methods but also the data signals and/or products, such as a CD-ROM, on which the programme is stored. The patentability of substances discovered using in-silico methods is a separate debate that is unlikely to be resolved in the near future.

  10. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  11. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-01-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm. PMID:26879404

  12. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  13. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    Science.gov (United States)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern Biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S States (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bionformatics content of each state's Biology standards were categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, likeliness to engage in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  14. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  15. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  16. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  17. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Science.gov (United States)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  18. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  19. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  20. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  1. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  2. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  3. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floraino, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  4. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  5. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  6. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  7. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  8. Developing sustainable software solutions for bioinformatics by the " Butterfly" paradigm.

    Science.gov (United States)

    Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas

    2014-01-01

    Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development (" Butterfly" paradigm) for key steps in scientific software engineering. User feedback is valued as well as software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as a database/tool development independently. We validated the approach of our own software development and compared the different design paradigms in various software solutions.

  9. Bioinformatic Analysis of BBTV Satellite DNA in Hainan

    Institute of Scientific and Technical Information of China (English)

    Nai-tong Yu; Tuan-cheng Feng; Yu-liang Zhang; Jian-hua Wang; Zhi-xin Liu

    2011-01-01

    Banana bunchy top virus (BBTV),family Nanaviridae,genus Babuvirus,is a single stranded DNA virus (ssDNA) that causes banana bunchy top disease (BBTD) in banana plants.It is the most common and most destructive of all viruses in these plants and is widespread throughout the Asia-Pacific region.In this study we isolated,cloned and sequenced a BBTV sample from Hainan Island,China.The results from sequencing and bioinformatics analysis indicate this isolate represents a satellite DNA component with 12 DNA sequences motifs.We also predicted the physical and chemical properties,structure,signal peptide,phosphorylation,secondary structure,tertiary structure and functional domains of its encoding protein,and compare them with the corresponding quantities in the replication initiation protein of BBTV DNA1.

  10. Systems biology and bioinformatics in aging research: a workshop report.

    Science.gov (United States)

    Fuellen, Georg; Dengjel, Jörn; Hoeflich, Andreas; Hoeijemakers, Jan; Kestler, Hans A; Kowald, Axel; Priebe, Steffen; Rebholz-Schuhmann, Dietrich; Schmeck, Bernd; Schmitz, Ulf; Stolzing, Alexandra; Sühnel, Jürgen; Wuttke, Daniel; Vera, Julio

    2012-12-01

    In an "aging society," health span extension is most important. As in 2010, talks in this series of meetings in Rostock-Warnemünde demonstrated that aging is an apparently very complex process, where computational work is most useful for gaining insights and to find interventions that counter aging and prevent or counteract aging-related diseases. The specific topics of this year's meeting entitled, "RoSyBA: Rostock Symposium on Systems Biology and Bioinformatics in Ageing Research," were primarily related to "Cancer and Aging" and also had a focus on work funded by the German Federal Ministry of Education and Research (BMBF). The next meeting in the series, scheduled for September 20-21, 2013, will focus on the use of ontologies for computational research into aging, stem cells, and cancer. Promoting knowledge formalization is also at the core of the set of proposed action items concluding this report.

  11. Why Polyphenols have Promiscuous Actions? An Investigation by Chemical Bioinformatics.

    Science.gov (United States)

    Tang, Guang-Yan

    2016-05-01

    Despite their diverse pharmacological effects, polyphenols are poor for use as drugs, which have been traditionally ascribed to their low bioavailability. However, Baell and co-workers recently proposed that the redox potential of polyphenols also plays an important role in this, because redox reactions bring promiscuous actions on various protein targets and thus produce non-specific pharmacological effects. To investigate whether the redox reactivity behaves as a critical factor in polyphenol promiscuity, we performed a chemical bioinformatics analysis on the structure-activity relationships of twenty polyphenols. It was found that the gene expression profiles of human cell lines induced by polyphenols were not correlated with the presence or not of redox moieties in the polyphenols, but significantly correlated with their molecular structures. Therefore, it is concluded that the promiscuous actions of polyphenols are likely to result from their inherent structural features rather than their redox potential.

  12. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with the query length of 511 amino acid). © 2010 IEEE.

  13. Integrative content-driven concepts for bioinformatics ``beyond the cell"

    Indian Academy of Sciences (India)

    Edgar Wingender; Torsten Crass; Jennifer D Hogan; Alexander E Kel; Olga V Kel-Margoulis; Anatolij P Potapov

    2007-01-01

    Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global (‘omics’) approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern ``systems biology”. The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues/organs.

  14. The web server of IBM's Bioinformatics and Pattern Discovery group.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  15. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    Science.gov (United States)

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in the steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all complete sequenced bacterial genomes followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functional related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to posses a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were that of Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryachaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbons-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  16. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations

    Directory of Open Access Journals (Sweden)

    David L. Adelson

    2006-04-01

    Full Text Available An electronic workshop was conducted on 4 November–13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second ‘stakeholder’ e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a to define priorities for animal genome database development; and (b to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a data repository; (b tools for genome analysis; (c annotation; (d practical application of genomic data; and (e a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS, was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA.

  17. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations.

    Science.gov (United States)

    Hamernik, Debora L; Adelson, David L

    2003-01-01

    An electronic workshop was conducted on 4 November-13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver) format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second 'stakeholder' e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a) to define priorities for animal genome database development; and (b) to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a) data repository; (b) tools for genome analysis; (c) annotation; (d) practical application of genomic data; and (e) a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS), was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES) was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA.

  18. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA of the Object Management Group (OMG, we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  19. Genetic and bioinformatics analysis of four novel GCK missense variants detected in Caucasian families with GCK-MODY phenotype.

    Science.gov (United States)

    Costantini, S; Malerba, G; Contreas, G; Corradi, M; Marin Vargas, S P; Giorgetti, A; Maffeis, C

    2015-05-01

    Heterozygous loss-of-function mutations in the glucokinase (GCK) gene cause maturity-onset diabetes of the young (MODY) subtype GCK (GCK-MODY/MODY2). GCK sequencing revealed 16 distinct mutations (13 missense, 1 nonsense, 1 splice site, and 1 frameshift-deletion) co-segregating with hyperglycaemia in 23 GCK-MODY families. Four missense substitutions (c.718A>G/p.Asn240Asp, c.757G>T/p.Val253Phe, c.872A>C/p.Lys291Thr, and c.1151C>T/p.Ala384Val) were novel and a founder effect for the nonsense mutation (c.76C>T/p.Gln26*) was supposed. We tested whether an accurate bioinformatics approach could strengthen family-genetic evidence for missense variant pathogenicity in routine diagnostics, where wet-lab functional assays are generally unviable. In silico analyses of the novel missense variants, including orthologous sequence conservation, amino acid substitution (AAS)-pathogenicity predictors, structural modeling and splicing predictors, suggested that the AASs and/or the underlying nucleotide changes are likely to be pathogenic. This study shows how a careful bioinformatics analysis could provide effective suggestions to help molecular-genetic diagnosis in absence of wet-lab validations.

  20. G-quadruplexes as novel cis-elements controlling transcription during embryonic development.

    Science.gov (United States)

    David, Aldana P; Margarit, Ezequiel; Domizi, Pablo; Banchio, Claudia; Armas, Pablo; Calcaterra, Nora B

    2016-05-19

    G-quadruplexes are dynamic structures folded in G-rich single-stranded DNA regions. These structures have been recognized as a potential nucleic acid based mechanism for regulating multiple cellular processes such as replication, transcription and genomic maintenance. So far, their transcriptional role in vivo during vertebrate embryonic development has not yet been addressed. Here, we performed an in silico search to find conserved putative G-quadruplex sequences (PQSs) within proximal promoter regions of human, mouse and zebrafish developmental genes. Among the PQSs able to fold in vitro as G-quadruplex, those present in nog3, col2a1 and fzd5 promoters were selected for further studies. In cellulo studies revealed that the selected G-quadruplexes affected the transcription of luciferase controlled by the SV40 nonrelated promoter. G-quadruplex disruption in vivo by microinjection in zebrafish embryos of either small ligands or DNA oligonucleotides complementary to the selected PQSs resulted in lower transcription of the targeted genes. Moreover, zebrafish embryos and larvae phenotypes caused by the presence of complementary oligonucleotides fully resembled those ones reported for nog3, col2a1 and fzd5 morphants. To our knowledge, this is the first work revealing in vivo the role of conserved G-quadruplexes in the embryonic development, one of the most regulated processes of the vertebrates biology.

  1. Structural Analysis and Identification of Cis-Elements of Rice osRACD Gene

    Institute of Scientific and Technical Information of China (English)

    Yun-Zhe XU; Rui-Lin YOU; Nai-Hu WU

    2004-01-01

    The osRACD gene correlated with fertility transformation in the photoperiod sensitive genic male sterile rice(PGMR),Nongken 58S,encoded a rice(Oryza sativa L.ssp.japonica)small GTPase belonging to the Rac/Rho family.Inverse PCR was performed to amplify a fragment about 1.4 kb in 5′ upstream region of the osRACD promoter.Deletion mutation and gel mobility shift assay characterized two fragments(-799 to -686 nt,and -686 to -431 nt)in the osRACD promoter that could be involved in its transcriptional regulation.When these two deletion fragments were used as probe respectively,a retarded band appeared in the nuclear extracts of fertile 58S rice under short day(58S-SD).Whereas no retarded band was shown in the nuclear extracts of sterile 58S rice under long day(58S-LD).Competition assay indicated that the factors in the retarded bands binding to these two fragments were the same trans-acting factor(termed rice factor,RF).The binding affinity of RF was affected by phosphorylation and was higher in SD-growth rice than that of LD-growth rice.

  2. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  3. Human Papillomavirus Type 18 cis-Elements Crucial for Segregation and Latency.

    Directory of Open Access Journals (Sweden)

    Mart Ustav

    Full Text Available Stable maintenance replication is characteristic of the latency phase of HPV infection, during which the viral genomes are actively maintained as extrachromosomal genetic elements in infected proliferating basal keratinocytes. Active replication in the S-phase and segregation of the genome into daughter cells in mitosis are required for stable maintenance replication. Most of our knowledge about papillomavirus genome segregation has come from studies of bovine papillomavirus type 1 (BPV-1, which have demonstrated that the E2 protein cooperates with cellular trans-factors and that E2 binding sites act as cis-regulatory elements in the viral genome that are essential for the segregation process. However, the genomic organization of the regulatory region in HPVs, and the properties of the viral proteins are different from those of their BPV-1 counterparts. We have designed a segregation assay for HPV-18 and used it to demonstrate that the E2 protein performs segregation in combination with at least two E2 binding sites. The cooperative binding of the E2 protein to two E2 binding sites is a major determinant of HPV-18 genome segregation, as demonstrated by the change in spacing between adjacent binding sites #1 and #2 in the HPV-18 Upstream Regulatory Region (URR. Duplication or triplication of the natural 4 bp 5'-CGGG-3' spacer between the E2 binding sites increased the cooperative binding of the E2 molecules as well as E2-dependent segregation. Removal of any spacing between these sites eliminated cooperative binding of the E2 protein and disabled segregation of the URR and HPV-18 genome. Transfer of these configurations of the E2 binding sites into viral genomes confirmed the role of the E2 protein and binding sites #1 and #2 in the segregation process. Additional analysis demonstrated that these sites also play an important role in the transcriptional regulation of viral gene expression from different HPV-18 promoters.

  4. Comparative analyses of prophage-like elements present in two Lactococcus lactis strains

    NARCIS (Netherlands)

    Ventura, Marco; Zomer, Aldert; Canchaya, Carlos; O'Connell-Motherway, Mary; Kuipers, Oscar; Turroni, Francesca; Ribbera, Angela; Foroni, Elena; Buist, Girbe; Wegmann, Udo; Shearman, Claire; Gasson, Michael J.; Fitzgerald, Gerald F.; Kok, Jan; van Sinderen, Douwe; O’Connell-Motherway, Mary

    2007-01-01

    In this study, we describe the genetic organizations of six and five apparent prophage-like elements present in the genomes of the Lactococcus lactis subsp. cremoris strains MG1363 and SK11, respectively. Phylogenetic investigation as well bioinformatic analyses indicates that all 11 prophages belon

  5. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics.

  6. Automation of Bioinformatics Workflows using CloVR, a Cloud Virtual Resource

    Science.gov (United States)

    Vangala, Mahesh

    2013-01-01

    Exponential growth of biological data, mainly due to revolutionary developments in NGS technologies in past couple of years, created a multitude of challenges in downstream data analysis using bioinformatics approaches. To handle such tsunami of data, bioinformatics analysis must be carried out in an automated and parallel fashion. A successful analysis often requires more than a few computational steps and bootstrapping these individual steps (scripts) into components and the components into pipelines certainly makes bioinformatics a reproducible and manageable segment of scientific research. CloVR (http://clovr.org) is one such flexible framework that facilitates the abstraction of bioinformatics workflows into executable pipelines. CloVR comes packaged with various built-in bioinformatics pipelines that can make use of multicore processing power when run on servers and/or cloud. CloVR is amenable to build custom pipelines based on individual laboratory requirements. CloVR is available as a single executable virtual image file that comes bundled with pre-installed and pre-configured bioinformatics tools and packages and thus circumvents the cumbersome installation difficulties. CloVR is highly portable and can be run on traditional desktop/laptop computers, central servers and cloud compute farms. In conclusion, CloVR provides built-in automated analysis pipelines for microbial genomics with a scope to develop and integrate custom-workflows that make use of parallel processing power when run on compute clusters, there by addressing the bioinformatics challenges with NGS data.

  7. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    Full Text Available Abstract Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1 the platforms on which the applications run are heterogeneous, 2 their web interface is not machine-friendly, 3 they use a non-standard format for data input and output, 4 they do not exploit standards to define application interface and message exchange, and 5 existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates

  8. Microsatellites for next-generation ecologists: a post-sequencing bioinformatics pipeline.

    Directory of Open Access Journals (Sweden)

    Iria Fernandez-Silva

    Full Text Available Microsatellites are the markers of choice for a variety of population genetic studies. The recent advent of next-generation pyrosequencing has drastically accelerated microsatellite locus discovery by providing a greater amount of DNA sequencing reads at lower costs compared to other techniques. However, laboratory testing of PCR primers targeting potential microsatellite markers remains time consuming and costly. Here we show how to reduce this workload by screening microsatellite loci via bioinformatic analyses prior to primer design. Our method emphasizes the importance of sequence quality, and we avoid loci associated with repetitive elements by screening with repetitive sequence databases available for a growing number of taxa. Testing with the Yellowstripe Goatfish Mulloidichthys flavolineatus and the marine planktonic copepod Pleuromamma xiphias we show higher success rate of primers selected by our pipeline in comparison to previous in silico microsatellite detection methodologies. Following the same pipeline, we discover and select microsatellite loci in nine additional species including fishes, sea stars, copepods and octopuses.

  9. A Critical Review on the Use of Support Values in Tree Viewers and Bioinformatics Toolkits.

    Science.gov (United States)

    Czech, Lucas; Huerta-Cepas, Jaime; Stamatakis, Alexandros

    2017-03-22

    Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that arise when displaying branch values on trees after rerooting. Branch values are typically stored as node labels in the widely-used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when rerooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed ten tree viewers and ten bioinformatics toolkits that can display and reroot trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements in eight tools. We suggest tools should provide options that explicitly force users to define the semantics of node labels.

  10. [Cloning and bioinformatics analysis of SLA-DR genes in Hunan Shaziling pigs].

    Science.gov (United States)

    Tang, Yi-Ya; Xing, Xiao-Wei; Xue, Li-Qun; Huang, Sheng-Qiang; Wang, Wei

    2007-12-01

    In order to clone class II DRA and DRB genes of swine leukocyte antigen (SLA) in Hunan Shaziling pigs, to analyze their characteristics and polymorphism and to provide immunological basic parameters for xenotransplantation from pigs to humans. SLA-DRA and SLA-DRB genes in two Shaziling pigs with the absence of porcine endogenous retrovirus (PERV) env-c were amplified by RT-PCR, cloned into PUCm-T vectors, sequenced and analyzed through BLAST in NCBI and related software in ExPASY. The obtained SLA-DRA and SLA-DRB genes of Shaziling pigs were 1,177 and 909 nucleotides in length with their accession numbers in Genbank as EF143987 and EF143988. Bioinformatics analyses have shown that they both contain opening reading frame (ORF) and encode 252 and 266 amino acids respectively. Comparing the ORF and protein sequences of the Shaziling SLA-DRA and SLA-DRB genes with their counterpart sequences of human, the homologies of nucleotide sequences were 83% and 83%, and the homologies of amino acid sequences 83 % and 79% respectively. Further comparison with SLA sequences published in GenBank indicated that SLA-DRB gene found in Shaziling pigs has polymorphism while the homology of SLA-DRA gene is up to 100 % .

  11. Bioinformatical parsing of folding-on-binding proteins reveals their compositional and evolutionary sequence design.

    Science.gov (United States)

    Narasumani, Mohanalakshmi; Harrison, Paul M

    2015-12-18

    Intrinsic disorder occurs when (part of) a protein remains unfolded during normal functioning. Intrinsically-disordered regions can contain segments that 'fold on binding' to another molecule. Here, we perform bioinformatical parsing of human 'folding-on-binding' (FB) proteins, into four subsets: Ordered regions, FB regions, Disordered regions that surround FB regions ('Disordered-around-FB'), and Other-Disordered regions. We examined the composition and evolutionary behaviour (across vertebrate orthologs) of these subsets. From a convergence of three separate analyses, we find that for hydrophobicity, Ordered regions segregate from the other subsets, but the Ordered and FB regions group together as highly conserved, and the Disordered-around-FB and Other-Disordered regions as less conserved (with a lesser significant difference between Ordered and FB regions). FB regions are highly-conserved with net positive charge, whereas Disordered-around-FB have net negative charge and are relatively less hydrophobic than FB regions. Indeed, these Disordered-around-FB regions are excessively hydrophilic compared to other disordered regions generally. We describe how our results point towards a possible compositionally-based steering mechanism of folding-on-binding.

  12. Advances in translational bioinformatics and population genomics in the Asia-Pacific.

    Science.gov (United States)

    Ranganathan, Shoba; Tongsima, Sissades; Chan, Jonathan; Tan, Tin Wee; Schönbach, Christian

    2012-01-01

    The theme of the 2012 International Conference on Bioinformatics (InCoB) in Bangkok, Thailand was "From Biological Data to Knowledge to Technological Breakthroughs." Besides providing a forum for life scientists and bioinformatics researchers in the Asia-Pacific region to meet and interact, the conference also hosted thematic sessions on the Pan-Asian Pacific Genome Initiative and immunoinformatics. Over the seven years of conference papers published in BMC Bioinformatics and four years in BMC Genomics, we note that there is increasing interest in the applications of -omics technologies to the understanding of diseases, as a forerunner to personalized genomic medicine.

  13. Cancer bioinformatics: detection of chromatin states,SNP-containing motifs, and functional enrichment modules

    Institute of Scientific and Technical Information of China (English)

    Xiaobo Zhou

    2013-01-01

    In this editorial preface,I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field:detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer.We expect this issue to provide bioinformatics scientists,cancer biologists,and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  14. New Link in Bioinformatics Services Value Chain: Position, Organization and Business Model

    Directory of Open Access Journals (Sweden)

    Mladen Čudanov

    2012-11-01

    Full Text Available This paper presents development in the bioinformatics services industry value chain, based on cloud computing paradigm. As genome sequencing costs per Megabase exponentially drop, industry needs to adopt. Paper has two parts: theoretical analysis and practical example of Seven Bridges Genomics Company. We are focused on explaining organizational, business and financial aspects of new business model in bioinformatics services, rather than technical side of the problem. In the light of that we present twofold business model fit for core bioinformatics research and Information and Communication Technologie (ICT support in the new environment, with higher level of capital utilization and better resistance to business risks.

  15. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  16. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Science.gov (United States)

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula. PMID:27049397

  17. Learning structural bioinformatics and evolution with a snake puzzle

    Directory of Open Access Journals (Sweden)

    Gonzalo S. Nido

    2016-12-01

    Full Text Available We propose here a working unit for teaching basic concepts of structural bioinformatics and evolution through the example of a wooden snake puzzle, strikingly similar to toy models widely used in the literature of protein folding. In our experience, developed at a Master’s course at the Universidad Autónoma de Madrid (Spain, the concreteness of this example helps to overcome difficulties caused by the interdisciplinary nature of this field and its high level of abstraction, in particular for students coming from traditional disciplines. The puzzle will allow us discussing a simple algorithm for finding folded solutions, through which we will introduce the concept of the configuration space and the contact matrix representation. This is a central tool for comparing protein structures, for studying simple models of protein energetics, and even for a qualitative discussion of folding kinetics, through the concept of the Contact Order. It also allows a simple representation of misfolded conformations and their free energy. These concepts will motivate evolutionary questions, which we will address by simulating a structurally constrained model of protein evolution, again modelled on the snake puzzle. In this way, we can discuss the analogy between evolutionary concepts and statistical mechanics that facilitates the understanding of both concepts. The proposed examples and literature are accessible, and we provide supplementary material (see ‘Data Availability’ to reproduce the numerical experiments. We also suggest possible directions to expand the unit. We hope that this work will further stimulate the adoption of games in teaching practice.

  18. Progress and challenges in bioinformatics approaches for enhancer identification

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2017-02-03

    Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration.

  19. Exploiting graphics processing units for computational biology and bioinformatics.

    Science.gov (United States)

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of generalpurpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700.

  20. Databases, models, and algorithms for functional genomics: a bioinformatics perspective.

    Science.gov (United States)

    Singh, Gautam B; Singh, Harkirat

    2005-02-01

    A variety of patterns have been observed on the DNA and protein sequences that serve as control points for gene expression and cellular functions. Owing to the vital role of such patterns discovered on biological sequences, they are generally cataloged and maintained within internationally shared databases. Furthermore,the variability in a family of observed patterns is often represented using computational models in order to facilitate their search within an uncharacterized biological sequence. As the biological data is comprised of a mosaic of sequence-levels motifs, it is significant to unravel the synergies of macromolecular coordination utilized in cell-specific differential synthesis of proteins. This article provides an overview of the various pattern representation methodologies and the surveys the pattern databases available for use to the molecular biologists. Our aim is to describe the principles behind the computational modeling and analysis techniques utilized in bioinformatics research, with the objective of providing insight necessary to better understand and effectively utilize the available databases and analysis tools. We also provide a detailed review of DNA sequence level patterns responsible for structural conformations within the Scaffold or Matrix Attachment Regions (S/MARs).

  1. Phylogenetic diversity (PD and biodiversity conservation: some bioinformatics challenges

    Directory of Open Access Journals (Sweden)

    Daniel P. Faith

    2006-01-01

    Full Text Available Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a. While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bio-informatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data.

  2. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula.

    Science.gov (United States)

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  3. Coherent pipeline for biomarker discovery using mass spectrometry and bioinformatics

    Directory of Open Access Journals (Sweden)

    Al-Shahib Ali

    2010-08-01

    Full Text Available Abstract Background Robust biomarkers are needed to improve microbial identification and diagnostics. Proteomics methods based on mass spectrometry can be used for the discovery of novel biomarkers through their high sensitivity and specificity. However, there has been a lack of a coherent pipeline connecting biomarker discovery with established approaches for evaluation and validation. We propose such a pipeline that uses in silico methods for refined biomarker discovery and confirmation. Results The pipeline has four main stages: Sample preparation, mass spectrometry analysis, database searching and biomarker validation. Using the pathogen Clostridium botulinum as a model, we show that the robustness of candidate biomarkers increases with each stage of the pipeline. This is enhanced by the concordance shown between various database search algorithms for peptide identification. Further validation was done by focusing on the peptides that are unique to C. botulinum strains and absent in phylogenetically related Clostridium species. From a list of 143 peptides, 8 candidate biomarkers were reliably identified as conserved across C. botulinum strains. To avoid discarding other unique peptides, a confidence scale has been implemented in the pipeline giving priority to unique peptides that are identified by a union of algorithms. Conclusions This study demonstrates that implementing a coherent pipeline which includes intensive bioinformatics validation steps is vital for discovery of robust biomarkers. It also emphasises the importance of proteomics based methods in biomarker discovery.

  4. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1 and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA, two integrin beta (ITGB, and one syndecan (SDC. Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  5. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing‐data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  6. A Bioinformatics Filtering Strategy for Identifying Radiation Response Biomarker Candidates

    Science.gov (United States)

    Oh, Jung Hun; Wong, Harry P.; Wang, Xiaowei; Deasy, Joseph O.

    2012-01-01

    The number of biomarker candidates is often much larger than the number of clinical patient data points available, which motivates the use of a rational candidate variable filtering methodology. The goal of this paper is to apply such a bioinformatics filtering process to isolate a modest number (<10) of key interacting genes and their associated single nucleotide polymorphisms involved in radiation response, and to ultimately serve as a basis for using clinical datasets to identify new biomarkers. In step 1, we surveyed the literature on genetic and protein correlates to radiation response, in vivo or in vitro, across cellular, animal, and human studies. In step 2, we analyzed two publicly available microarray datasets and identified genes in which mRNA expression changed in response to radiation. Combining results from Step 1 and Step 2, we identified 20 genes that were common to all three sources. As a final step, a curated database of protein interactions was used to generate the most statistically reliable protein interaction network among any subset of the 20 genes resulting from Steps 1 and 2, resulting in identification of a small, tightly interacting network with 7 out of 20 input genes. We further ranked the genes in terms of likely importance, based on their location within the network using a graph-based scoring function. The resulting core interacting network provides an attractive set of genes likely to be important to radiation response. PMID:22768051

  7. A bioinformatics filtering strategy for identifying radiation response biomarker candidates.

    Directory of Open Access Journals (Sweden)

    Jung Hun Oh

    Full Text Available The number of biomarker candidates is often much larger than the number of clinical patient data points available, which motivates the use of a rational candidate variable filtering methodology. The goal of this paper is to apply such a bioinformatics filtering process to isolate a modest number (<10 of key interacting genes and their associated single nucleotide polymorphisms involved in radiation response, and to ultimately serve as a basis for using clinical datasets to identify new biomarkers. In step 1, we surveyed the literature on genetic and protein correlates to radiation response, in vivo or in vitro, across cellular, animal, and human studies. In step 2, we analyzed two publicly available microarray datasets and identified genes in which mRNA expression changed in response to radiation. Combining results from Step 1 and Step 2, we identified 20 genes that were common to all three sources. As a final step, a curated database of protein interactions was used to generate the most statistically reliable protein interaction network among any subset of the 20 genes resulting from Steps 1 and 2, resulting in identification of a small, tightly interacting network with 7 out of 20 input genes. We further ranked the genes in terms of likely importance, based on their location within the network using a graph-based scoring function. The resulting core interacting network provides an attractive set of genes likely to be important to radiation response.

  8. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  9. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    Science.gov (United States)

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, the PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progresses made by Chineses bioinformaticians in the field of PTM Bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. Through comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  10. Computer programming and biomolecular structure studies: A step beyond internet bioinformatics.

    Science.gov (United States)

    Likić, Vladimir A

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled Biomolecular Structure and Bioinformatics. Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics approach that relies on access to the Internet and biological databases. This was an ambitious approach considering that the students mostly had a biological background. There were also time constraints of eight lectures in total and two accompanying practical sessions. The main challenge was that students had to be introduced to computer programming from a beginner level and in a short time provided with enough knowledge to independently solve a simple bioinformatics problem. This was accomplished with a problem directly relevant to the rest of the subject, concerned with the structure-function relationships and experimental techniques for the determination of macromolecular structure.

  11. Bioinformatics in high school biology curricula: a study of state science standards.

    Science.gov (United States)

    Wefer, Stephen H; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students.

  12. How to Establish a Bioinformatics Postgraduate Degree Programme: A Case Study from South Africa

    CERN Document Server

    Machanick, Philip

    2014-01-01

    The Research Unit in Bioinformatics at Rhodes University (RUBi), South Africa offers a Masters of Science in Bioinformatics. Growing demand for Bioinformatics qualifications results in applications from across Africa. Courses aim to bridge gaps in the diverse backgrounds of students who range from biologists with no prior computing exposure to computer scientists with no biology background. The programme is evenly split between coursework and research, with diverse modules from a range of departments covering mathematics, statistics, computer science and biology, with emphasis on application to bioinformatics research. The early focus on research helps bring students up to speed with working as a researcher. We measure success of the programme by the high rate of subsequent entry to PhD study: 10 out of 14 students who completed in the years 2011-2013.

  13. Bioinformatic science and devices for computer analysis and visualization of macromolecules

    Directory of Open Access Journals (Sweden)

    Yu.B. Porozov

    2010-06-01

    Full Text Available The goals and objectives of bioinformatic science are presented in the article. The main methods and approaches used in computer biology are highlighted. Areas in which bioinformatic science can greatly facilitate and speed up the work of practical biologist and pharmacologist are revealed. The features of both the basic packages and software devices for complete, thorough analysis of macromolecules and for development and modeling of ligands and binding centers are described

  14. Recent advances in operations research in computational biology, bioinformatics and medicine

    OpenAIRE

    Türkay, Metin; Felici, Giovanni; Szachniuk, Marta; Lukasiak, Piotr

    2014-01-01

    The EURO Working Group on Operations Research in Computational Biology, Bioinformatics and Medicine held its fourth conference in Poznan-Biedrusko, Poland, June 26-28, 2014. The editorial board of RAIRO-OR invited submissions of papers to a special issue on Recent Advances in Operations Research in Computational Biology, Bioinformatics and Medicine. This special issue includes nine papers that were selected among forty presentations and included in this special issue after two rounds of revie...

  15. A New Technique to Manage Big Bioinformatics Data Using Genetic Algorithms

    Directory of Open Access Journals (Sweden)

    Huda Jalil Dikhil

    2016-10-01

    Full Text Available The continuous growth of data, mainly the medical data at laboratories becomes very complex to use and to manage by using traditional ways. So, the researchers start studying genetic information field which increased in the past thirty years in bioinformatics domain (the computer science field, genetic biology field, and DNA. This growth of data becomes known as big bioinformatics data. Thus, efficient algorithms such as Genetic Algorithms are needed to deal with this big and vast amount of bioinformatics data in genetic laboratories. So the researchers proposed two models to manage the big bioinformatics data in addition to the traditional model. The first model by applying Genetic Algorithms before MapReduce, the second model by applying Genetic Algorithms after the MapReduce, and the original or the traditional model by applying only MapReduce without using Genetic Algorithms. The three models were implemented and evaluated using big bioinformatics data collected from the Duchenne Muscular Dystrophy (DMD disorder. The researchers conclude that the second model is the best one among the three models in reducing the size of the data, in execution time, and in addition to the ability to manage and summarize big bioinformatics data. Finally by comparing the percentage errors of the second model with the first model and the traditional model, the researchers obtained the following results 1.136%, 10.227%, and 11.363% respectively. So the second model is the most accurate model with the less percentage error.

  16. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that

  17. What can bioinformatics do for Natural History museums?

    Directory of Open Access Journals (Sweden)

    Becerra, José María

    2003-06-01

    Full Text Available We propose the founding of a Natural History bioinformatics framework, which would solve one of the main problems in Natural History: data which is scattered around in many incompatible systems (not only computer systems, but also paper ones. This framework consists of computer resources (hardware and software, methodologies that ease the circulation of data, and staff expert in dealing with computers, who will develop software solutions to the problems encountered by naturalists. This system is organized in three layers: acquisition, data and analysis. Each layer is described, and an account of the elements that constitute it given.

    Se presentan las bases de una estructura bioinformática para Historia Natural, que trata de resolver uno de los principales problemas en ésta: la presencia de datos distribuidos a lo largo de muchos sistemas incompatibles entre sí (y no sólo hablamos de sistemas informáticos, sino también en papel. Esta estructura se sustenta en recursos informáticos (en sus dos vertientes: hardware y software, en metodologías que permitan la fácil circulación de los datos, y personal experto en el uso de ordenadores que se encargue de desarrollar soluciones software a los problemas que plantean los naturalistas. Este sistema estaría organizado en tres capas: de adquisición, de datos y de análisis. Cada una de estas capas se describe, indicando los elementos que la componen.

  18. Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

    Energy Technology Data Exchange (ETDEWEB)

    Ngu, A; Rocco, D; Critchlow, T; Buttler, D

    2003-12-22

    The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources--e.g. BLAST sequence homology search interfaces. However, the process for seeking the desired scientific information is still very tedious and frustrating. While there are several known servers on genomic data (e.g., GeneBank, EMBL, NCBI), that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results are hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change out-pace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source specific wrappers is needed to assist scientists to access to hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics which is inherent in the design of the Web data source. In this paper, we propose an automatic approach to classify Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

  19. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  20. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    Science.gov (United States)

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies.

  1. Bioinformatics Analyses of the Role of Vascular Endothelial Growth Factor in Patients with Non-Small Cell Lung Cancer.

    Directory of Open Access Journals (Sweden)

    Ying Wang

    Full Text Available This study was aimed to identify the expression pattern of vascular endothelial growth factor (VEGF in non-small cell lung cancer (NSCLC and to explore its potential correlation with the progression of NSCLC.Gene expression profile GSE39345 was downloaded from the Gene Expression Omnibus database. Twenty healthy controls and 32 NSCLC samples before chemotherapy were analyzed to identify the differentially expressed genes (DEGs. Then pathway enrichment analysis of the DEGs was performed and protein-protein interaction networks were constructed. Particularly, VEGF genes and the VEGF signaling pathway were analyzed. The sub-network was constructed followed by functional enrichment analysis.Total 1666 up-regulated and 1542 down-regulated DEGs were identified. The down-regulated DEGs were mainly enriched in the pathways associated with cancer. VEGFA and VEGFB were found to be the initiating factor of VEGF signaling pathway. In addition, in the epidermal growth factor receptor (EGFR, VEGFA and VEGFB associated sub-network, kinase insert domain receptor (KDR, fibronectin 1 (FN1, transforming growth factor beta induced (TGFBI and proliferating cell nuclear antigen (PCNA were found to interact with at least two of the three hub genes. The DEGs in this sub-network were mainly enriched in Gene Ontology terms related to cell proliferation.EGFR, KDR, FN1, TGFBI and PCNA may interact with VEGFA to play important roles in NSCLC tumorigenesis. These genes and corresponding proteins may have the potential to be used as the targets for either diagnosis or treatment of patients with NSCLC.

  2. Microscopy and Bioinformatic Analyses of Lipid Metabolism Implicate a Sporophytic Signaling Network Supporting Pollen Development in Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Yixing Wang; Hong Wu; Ming Yang

    2008-01-01

    The Arabidopsis sporophytic tapetum undergoes a programmed degeneration process to secrete lipid and other materials to support pollen development.However,the molecular mechanism regulating the degeneration process is unknown.To gain insight into this molecular mechanism,we first determined that the most critical period for tapetal secretion to support pollen development iS from the vacuolate microspore stage to the early binucleate pollen stage.We then analyzed the expression of enzymes responsible for lipid biosynthesis and degradation with available in-silico data.The genes for these enzymes that are expressed in the stamen but not in the concurrent uninucleate microspore and binucleate pollen are of particular interest,as they presumably hold the clues to unique molecular processes in the sporophytic tissues compared to the gametophytic tissue.No gene for lipid biosynthesis but a single gene encoding a patatin-like protein likely for lipid mobilization was identified based on the selection criterion.A search for genes co-expressed with this gene identified additional genes encoding typical signal transduction components such as a leucine-rich repeat receptor kinase,an extra-large G-protein,other protein kinases,and transcription factors.In addition,proteases,cell wall degradation enzymes,and other proteins were also identified.These proteins thus may be components of a signaling network leading to degradation of a broad range of cellular components.Since a broad range of degradation activities is expected to occur only in the tapetal degeneration process at this stage in the stamen,it iS further hypothesized that the signaling network acts in the tapetal degeneration process.

  3. Bioinformatics pipelines for targeted resequencing and whole-exome sequencing of human and mouse genomes: a virtual appliance approach for instant deployment.

    Directory of Open Access Journals (Sweden)

    Jason Li

    Full Text Available Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance, making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing, enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/.

  4. Rapid cloning and bioinformatic analysis of spinach Y chromosome-specific EST sequences

    Indian Academy of Sciences (India)

    Chuan-Liang Deng; Wei-Li Zhang; Ying Cao; Shao-Jing Wang; Shu-Fen Li; Wu-Jun Gao; Long-Dou Lu

    2015-12-01

    The genome of spinach single chromosome complement is about 1000 Mbp, which is the model material to study the molecular mechanisms of plant sex differentiation. The cytological study showed that the biggest spinach chromosome (chromosome 1) was taken as spinach sex chromosome. It had three alleles of sex-related , m and . Many researchers have been trying to clone the sex-determining genes and investigated the molecular mechanism of spinach sex differentiation. However, there are no successful cloned reports about these genes. A new technology combining chromosome microdissection with hybridization-specific amplification (HSA) was adopted. The spinach Y chromosome degenerate oligonucleotide primed-PCR (DOP-PCR) products were hybridized with cDNA of the male spinach flowers in florescence. The female spinach genome was taken as blocker and cDNA library specifically expressed in Y chromosome was constructed. Moreover, expressed sequence tag (EST) sequences in cDNA library were cloned, sequenced and bioinformatics was analysed. There were 63 valid EST sequences obtained in this study. The fragment size was between 53 and 486 bp. BLASTn homologous alignment indicated that 12 EST sequences had homologous sequences of nucleic acids, the rest were new sequences. BLASTx homologous alignment indicated that 16 EST sequences had homologous protein-encoding nucleic acid sequence. The spinach Y chromosome-specific EST sequences laid the foundation for cloning the functional genes, specifically expressed in spinach Y chromosome. Meanwhile, the establishment of the technology system in the research provided a reference for rapid cloning of other biological sex chromosome-specific EST sequences.

  5. Bioinformatic and functional characterization of the basic peroxidase 72 from Arabidopsis thaliana involved in lignin biosynthesis.

    Science.gov (United States)

    Herrero, Joaquín; Fernández-Pérez, Francisco; Yebra, Tatiana; Novo-Uzal, Esther; Pomar, Federico; Pedreño, Ma Ángeles; Cuello, Juan; Guéra, Alfredo; Esteban-Carrasco, Alberto; Zapata, José Miguel

    2013-06-01

    Lignins result from the oxidative polymerization of three hydroxycinnamyl (p-coumaryl, coniferyl, and sinapyl) alcohols in a reaction mediated by peroxidases. The most important of these is the cationic peroxidase from Zinnia elegans (ZePrx), an enzyme considered to be responsible for the last step of lignification in this plant. Bibliographical evidence indicates that the arabidopsis peroxidase 72 (AtPrx72), which is homolog to ZePrx, could have an important role in lignification. For this reason, we performed a bioinformatic, histochemical, photosynthetic, and phenotypical and lignin composition analysis of an arabidopsis knock-out mutant of AtPrx72 with the aim of characterizing the effects that occurred due to the absence of expression of this peroxidase from the aspects of plant physiology such as vascular development, lignification, and photosynthesis. In silico analyses indicated a high homology between AtPrx72 and ZePrx, cell wall localization and probably optimal levels of translation of AtPrx72. The histochemical study revealed a low content in syringyl units and a decrease in the amount of lignin in the atprx72 mutant plants compared to WT. The atprx72 mutant plants grew more slowly than WT plants, with both smaller rosette and principal stem, and with fewer branches and siliques than the WT plants. Lastly, chlorophyll a fluorescence revealed a significant decrease in ΦPSII and q L in atprx72 mutant plants that could be related to changes in carbon partitioning and/or utilization of redox equivalents in arabidopsis metabolism. The results suggest an important role of AtPrx72 in lignin biosynthesis. In addition, knock-out plants were able to respond and adapt to an insufficiency of lignification.

  6. Bioinformatics analysis for structure and function ofCPR ofPlasmodium falciparum

    Institute of Scientific and Technical Information of China (English)

    ZhigangFan; Lingmin Zhang; GuogangYan; QiangWu; XiufengGan; Saifeng Zhong; GuifenLin

    2011-01-01

    Objective:To analyse the structure and function ofNADPH-cytochrome p450 reductase(CYPOR orCPR) fromPlasmodium falciparum (Pf), and to predict its’ drug target and vaccine target. Methods: The structure, function, drug target and vaccine target ofCPR fromPlasmodium falciparum were analyzed and predicted by bioinformatics methods.Results:PfCPR, which was olderCPR, had close relationship with theCPR from otherPlasmodium species, but it was distant from its hosts, such asHomo sapiens andAnopheles.PfCPR was located in the cellular nucleus ofPlasmodium falciparum.335aa-352aa and591aa -608aa were inserted the interior side of the nuclear membrane, while151aa-265aa was located in the nucleolus organizer regions.PfCPR had40 function sites and44 protein-protein binding sites in amino acid sequence. The teriary structure of 1aa-700aa was forcep-shaped with wings.15 segments ofPfCPR had no homology withHomo sapien CPR and most were exposed on the surface of the protein. These segments had25 protein-protein binding sites. While13other segments all possessed function sites. Conclusions: The evolution or genesis ofPlasmodium falciparum is earlier than those ofHomo sapiens. PfCPR is a possible resistance site of antimalarial drug and may involve immune evasion, which is associated with parasite of sporozoite in hepatocytes.PfCPR is unsuitable as vaccine target, but it has at least 13 ideal drug targets.

  7. Molecular cloning and bioinformatic analysis of the Streptococcus agalactiae neuA gene isolated from tilapia.

    Science.gov (United States)

    Wang, E L; Wang, K Y; Chen, D F; Geng, Y; Huang, L Y; Wang, J; He, Y

    2015-06-01

    Cytidine monophosphate (CMP) N-acetylneuraminic acid (NeuNAc) synthetase, which is encoded by the neuA gene, can catalyze the activation of sialic acid with CMP, and plays an important role in Streptococcus agalactiae infection pathogenesis. To study the structure and function of the S. agalactiae neuA gene, we isolated it from diseased tilapia, amplified it using polymerase chain reaction (PCR) with specific primers, and cloned it into a pMD19-T vector. The recombinant plasmid was confirmed by PCR and restriction enzyme digestion, and identified by sequencing. Molecular characterization analyses of the neuA nucleotide amino acid sequence were performed using bioinformatic tools and an online server. The results showed that the neuA nucleotide sequence contained a complete coding region, which comprised 1242 bp, encoding 413 amino acids (aa). The aa sequence was highly conserved and contained a Glyco_tranf_GTA_type superfamily and an SGNH_hydrolase superfamily conserved domain, which are related to sialic acid activation catalysis. The NeuA protein possessed many important sites related to post-translational modification, including 28 potential phosphorylation sites and 2 potential N-glycosylation sites, had no signal peptides or transmembrane regions, and was predicted to reside in the cytoplasm. Moreover, the protein had some B-cell epitopes, which suggests its potential in development of a vaccine against S. agalactiae infection. The codon usage frequency of neuA differed greatly in Escherichia coli and Homo sapiens genes, and neuA may be more efficiently expressed in eukaryotes (yeast). S. agalactiae neuA from tilapia maintains high structural homology and sequence identity with CMP-NeuNAc synthetases from other bacteria.

  8. Bioinformatic prediction, deep sequencing of microRNAs and expression analysis during phenotypic plasticity in the pea aphid, Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Leterme Nathalie

    2010-05-01

    Full Text Available Abstract Background Post-transcriptional regulation in eukaryotes can be operated through microRNA (miRNAs mediated gene silencing. MiRNAs are small (18-25 nucleotides non-coding RNAs that play crucial role in regulation of gene expression in eukaryotes. In insects, miRNAs have been shown to be involved in multiple mechanisms such as embryonic development, tissue differentiation, metamorphosis or circadian rhythm. Insect miRNAs have been identified in different species belonging to five orders: Coleoptera, Diptera, Hymenoptera, Lepidoptera and Orthoptera. Results We developed high throughput Solexa sequencing and bioinformatic analyses of the genome of the pea aphid Acyrthosiphon pisum in order to identify the first miRNAs from a hemipteran insect. By combining these methods we identified 149 miRNAs including 55 conserved and 94 new miRNAs. Moreover, we investigated the regulation of these miRNAs in different alternative morphs of the pea aphid by analysing the expression of miRNAs across the switch of reproduction mode. Pea aphid microRNA sequences have been posted to miRBase: http://microrna.sanger.ac.uk/sequences/ Conclusions Our study has identified candidates as putative regulators involved in reproductive polyphenism in aphids and opens new avenues for further functional analyses.

  9. Genomics Politics through Space and Time: The Case of Bioinformatics in Brazil.

    Science.gov (United States)

    Bicudo, Edison

    2016-01-01

    The emergence of scientific disciplines, as well as the policies aimed to steer them, have geographical implications. This becomes visible in areas such as genomics and related fields. In this paper, the relation between scientific evolution, political decisions and geographical configuration is studied. The recent formation of bioinformatics in Brazil is focused on. The study involves an analysis of data collected on the website of CNPq, a funding agency attached to the Ministry of Science and Technology. Furthermore, I conducted fieldwork in four cities, interviewing 15 bioinformaticians. In the history of Brazilian bioinformatics, three periods can be identified. In the first period (1900-1996), bioinformatics was actually absent, but biology research groups were formed which would subsequently explore bioinformatics. The second period (1997-2006) was marked by the emergence of the discipline and geographical concentration of major research groups in the southern part of Brazil. A third period can be pointed to (2007-2014), in which political choices have turned geographical diffusion and institutional equality into a national target. As a consequence of the recent shifts, genomics and bioinformatics researchers have been involved in a debate, some defending the existence of few specialized research and sequencing platforms, whereas others welcoming the constitution of a scientific scenario based on decentralized platforms. I defend an intermediate solution, whereby some places would be selected to be genomics hubs. This would fit the regional diversity of this vast country, in addition to tackling the scientific weaknesses of the northern area.

  10. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    Science.gov (United States)

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields.

  11. Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers.

    Science.gov (United States)

    Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.

  12. XML schemas for common bioinformatic data types and their application in workflow systems

    Directory of Open Access Journals (Sweden)

    Mersch Henning

    2006-11-01

    Full Text Available Abstract Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM. These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

  13. High-throughput bioinformatics with the Cyrille2 pipeline system.

    NARCIS (Netherlands)

    Fiers, M.W.E.J.; Burgt, van der A.; Datema, E.; Groot, de J.C.W.; Ham, van R.C.H.J.

    2008-01-01

    Background - Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses a

  14. Innovative bioinformatic approaches for developing peptide-based vaccines against hypervariable viruses.

    Science.gov (United States)

    Sirskyj, Danylo; Diaz-Mitoma, Francisco; Golshani, Ashkan; Kumar, Ashok; Azizi, Ali

    2011-01-01

    The application of the fields of pharmacogenomics and pharmacogenetics to vaccine design has been recently labeled 'vaccinomics'. This newly named area of vaccine research, heavily intertwined with bioinformatics, seems to be leading the charge in developing novel vaccines for currently unmet medical needs against hypervariable viruses such as human immunodeficiency virus (HIV), hepatitis C and emerging avian and swine influenza. Some of the more recent bioinformatic approaches in the area of vaccine research include the use of epitope determination and prediction algorithms for exploring the use of peptide epitopes as vaccine immunogens. This paper briefly discusses and explores some current uses of bioinformatics in vaccine design toward the pursuit of peptide vaccines for hypervariable viruses. The various informatics and vaccine design strategies attempted by other groups toward hypervariable viruses will also be briefly examined, along with the strategy used by our group in the design and synthesis of peptide immunogens for candidate HIV and influenza vaccines.

  15. NETTAB 2014: From high-throughput structural bioinformatics to integrative systems biology.

    Science.gov (United States)

    Romano, Paolo; Cordero, Francesca

    2016-03-02

    The fourteenth NETTAB workshop, NETTAB 2014, was devoted to a range of disciplines going from structural bioinformatics, to proteomics and to integrative systems biology. The topics of the workshop were centred around bioinformatics methods, tools, applications, and perspectives for models, standards and management of high-throughput biological data, structural bioinformatics, functional proteomics, mass spectrometry, drug discovery, and systems biology.43 scientific contributions were presented at NETTAB 2014, including keynote, special guest and tutorial talks, oral communications, and posters. Full papers from some of the best contributions presented at the workshop were later submitted to a special Call for this Supplement.Here, we provide an overview of the workshop and introduce manuscripts that have been accepted for publication in this Supplement.

  16. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Full Text Available Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools.  This activity is designed to provide secondary and undergraduate biology students to a hands-on activity meant to explore and understand gene structure with the use of basic bioinformatic tools.  Students are provided an “unknown” sequence from which they are asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  17. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Full Text Available Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT, though some are instead seeing rapid growth (e.g., the GO, R. We find a striking imbalance in resource usage with the top 5% of resource names (133 names accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.

  18. Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes

    Science.gov (United States)

    Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.

    2013-01-01

    Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availabilty: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469

  19. Bioinformatics for the synthetic biology of natural products: integrating across the Design–Build–Test cycle

    Science.gov (United States)

    Currin, Andrew; Jervis, Adrian J.; Rattray, Nicholas J. W.; Swainston, Neil; Yan, Cunyu; Breitling, Rainer

    2016-01-01

    Covering: 2000 to 2016 Progress in synthetic biology is enabled by powerful bioinformatics tools allowing the integration of the design, build and test stages of the biological engineering cycle. In this review we illustrate how this integration can be achieved, with a particular focus on natural products discovery and production. Bioinformatics tools for the DESIGN and BUILD stages include tools for the selection, synthesis, assembly and optimization of parts (enzymes and regulatory elements), devices (pathways) and systems (chassis). TEST tools include those for screening, identification and quantification of metabolites for rapid prototyping. The main advantages and limitations of these tools as well as their interoperability capabilities are highlighted. PMID:27185383

  20. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  1. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  2. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  3. Implementation and Assessment of a Molecular Biology and Bioinformatics Undergraduate Degree Program

    Science.gov (United States)

    Pham, Daphne Q. -D.; Higgs, David C.; Statham, Anne; Schleiter, Mary Kay

    2008-01-01

    The Department of Biological Sciences at the University of Wisconsin-Parkside has developed and implemented an innovative, multidisciplinary undergraduate curriculum in Molecular Biology and Bioinformatics (MBB). The objective of the MBB program is to give students a hands-on facility with molecular biology theories and laboratory techniques, an…

  4. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    Science.gov (United States)

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  5. Alu Insertions and Genetic Diversity: A Preliminary Investigation by an Undergraduate Bioinformatics Class

    Science.gov (United States)

    Elwess, Nancy L.; Duprey, Stephen L.; Harney, Lindesay A.; Langman, Jessie E.; Marino, Tara C.; Martinez, Carolina; McKeon, Lauren L.; Moss, Chantel I. E.; Myrie, Sasha S.; Taylor, Luke Ryan

    2008-01-01

    "Alu"-insertion polymorphisms were used by an undergraduate Bioinformatics class to study how these insertion sites could be the basis for an investigation in human population genetics. Based on the students' investigation, both allele and genotype "Alu" frequencies were determined for African-American and Japanese populations as well as a…

  6. Expression and bioinformatic analysis of lymphoma-associated novel gene KIAA0372

    Institute of Scientific and Technical Information of China (English)

    BAI Xiangyang; TANG Duozhuang; ZHU Tao; SUN Lishi; YAN Lingling; LU Yunping; ZHOU Jianfeng; MA Ding

    2007-01-01

    The purpose of this study was to explore the differentially expressed genes in lymph-node cells (LNC) of lymphomas and reactive lymph node hyperplasia,and to perform an initial bioinformatic analysis on a novel gene,KIAA0372,which is highly expressed in the LNC of lymphomas.mRNA extracted from LNC of lymphomas and reactive lymph node hyperplasia were respectively marked with biotin and hybridized with Gene Expression Chips,resulting in differentially expressed genes.Initial bioinformatic analysis was then performed on a novel gene named KIAA0372,whose function has not yet been explored.Its structure and genomic location,its product's physical and chemical properties,subcellular localization and functional domains,were also predicted.Further,a systematic evolution analysis was performed on similar proteins from among several species.Using Gene Expression Chips,many differentially expressed genes were uncovered.Efficient bioinformatic analysis has fundamentally determined that KIAA0372 is an extracellular protein which may be involved in TGF-β signaling.Microarray is an efficient and high throughput strategy for detection of differentially expressed genes.And KIAA0372 is thought to be a potential target for tumor research using bioinformatic analysis.

  7. IFPA meeting 2016 workshop report I: Genomic communication, bioinformatics, trophoblast biology and transport systems.

    Science.gov (United States)

    Albrecht, Christiane; Baker, Julie C; Blundell, Cassidy; Chavez, Shawn L; Carbone, Lucia; Chamley, Larry; Hannibal, Roberta L; Illsley, Nick; Kurre, Peter; Laurent, Louise C; McKenzie, Charles; Morales-Prieto, Diana; Pantham, Priyadarshini; Paquette, Alison; Powell, Katie; Price, Nathan; Rao, Balaji M; Sadovsky, Yoel; Salomon, Carlos; Tuteja, Geetu; Wilson, Samantha; O'Tierney-Ginn, P F

    2017-01-11

    Workshops are an important part of the IFPA annual meeting as they allow for discussion of specialized topics. At IFPA meeting 2016 there were twelve themed workshops, four of which are summarized in this report. These workshops covered innovative technologies applied to new and traditional areas of placental research: 1) genomic communication; 2) bioinformatics; 3) trophoblast biology and pathology; 4) placental transport systems.

  8. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology.

    Science.gov (United States)

    González-Nilo, Fernando; Pérez-Acle, Tomás; Guínez-Molinos, Sergio; Geraldo, Daniela A; Sandoval, Claudia; Yévenes, Alejandro; Santos, Leonardo S; Laurie, V Felipe; Mendoza, Hegaly; Cachau, Raúl E

    2011-01-01

    After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  9. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    Science.gov (United States)

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  10. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Science.gov (United States)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  11. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, vario

  12. A bioinformatics-based overview of protein Lys-Ne-acetylation

    Science.gov (United States)

    Among posttranslational modifications, there are some conceptual similarities between Lys-N'-acetylation and Ser/Thr/Tyr O-phosphorylation. Herein we present a bioinformatics-based overview of reversible protein Lys-acetylation, including some comparisons with reversible protein phosphorylation. T...

  13. Infusing Bioinformatics and Research-Like Experience into a Molecular Biology Laboratory Course

    Science.gov (United States)

    Nogaj, Luiza A.

    2014-01-01

    A nine-week laboratory project designed for a sophomore level molecular biology course is described. Small groups of students (3-4 per group) choose a tumor suppressor gene (TSG) or an oncogene for this project. Each group researches the role of their TSG/oncogene from primary literature articles and uses bioinformatics engines to find the gene…

  14. Introducing Bioinformatics into the Biology Curriculum: Exploring the National Center for Biotechnology Information.

    Science.gov (United States)

    Smith, Thomas M.; Emmeluth, Donald S.

    2002-01-01

    Explains the potential of computer technology in science education and argues for integrating bioinformatics into the biology curriculum. Describes three modules designed to introduce students to technological advancements in biology. Aims to develop a better understanding of molecular biology among students. (YDS)

  15. Ramping up to the Biology Workbench: A Multi-Stage Approach to Bioinformatics Education

    Science.gov (United States)

    Greene, Kathleen; Donovan, Sam

    2005-01-01

    In the process of designing and field-testing bioinformatics curriculum materials, we have adopted a three-stage, progressive model that emphasizes collaborative scientific inquiry. The elements of the model include: (1) context setting, (2) introduction to concepts, processes, and tools, and (3) development of competent use of technologically…

  16. Incorporating a New Bioinformatics Component into Genetics at a Historically Black College: Outcomes and Lessons

    Science.gov (United States)

    Holtzclaw, J. David; Eisen, Arri; Whitney, Erika M.; Penumetcha, Meera; Hoey, J. Joseph; Kimbro, K. Sean

    2006-01-01

    Many students at minority-serving institutions are underexposed to Internet resources such as the human genome project, PubMed, NCBI databases, and other Web-based technologies because of a lack of financial resources. To change this, we designed and implemented a new bioinformatics component to supplement the undergraduate Genetics course at…

  17. Bioinformatics Education in High School: Implications for Promoting Science, Technology, Engineering, and Mathematics Careers

    Science.gov (United States)

    Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The…

  18. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten;

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  19. A novel bioinformatic strategy to characterise microbial communities in biogas reactors

    DEFF Research Database (Denmark)

    Treu, Laura; Campanaro, Stefano; De Francisci, Davide;

    2014-01-01

    , J.J. et al., 2011). For this reason we developed a bioinformatics strategy in order to create a tool to review the generated dataset and to obtain a more strict control on the bacterial composition at the species level, with estimation of its reliability. The program perform local similarity search...

  20. 10 years for the Journal of Bioinformatics and Computational Biology (2003-2013) -- a retrospective.

    Science.gov (United States)

    Eisenhaber, Frank; Sherman, Westley Arthur

    2014-06-01

    The Journal of Bioinformatics and Computational Biology (JBCB) started publishing scientific articles in 2003. It has established itself as home for solid research articles in the field (~ 60 per year) that are surprisingly well cited. JBCB has an important function as alternative publishing channel in addition to other, bigger journals.

  1. Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…

  2. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    Science.gov (United States)

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrackTM, which provides extensive functionalities to man...

  3. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.;

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  4. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling;

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  5. Integration of Proteomics, Bioinformatics and Systems biology in Brain Injury Biomarker Discovery

    Directory of Open Access Journals (Sweden)

    Joy eGuingab-Cagmat

    2013-05-01

    Full Text Available Traumatic brain injury (TBI is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed.

  6. A bioinformatics approach to the development of immunoassays for specified risk material in canned meat products

    NARCIS (Netherlands)

    Reece, P.; Bremer, M.G.E.G.; Stones, R.; Danks, C.; Baumgartner, S.; Tomkies, V.; Hemetsberger, C.; Smits, N.G.E.; Lubbe, W.

    2009-01-01

    A bioinformatics approach to developing antibodies to specific proteins has been evaluated for the production of antibodies to heat-processed specified risk tissues from ruminants (brain and eye tissue). The approach involved the identification of proteins specific to ruminant tissues by interrogati

  7. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein...

  8. A complementary bioinformatics approach to identify potential plant cell wall glycosytransferase encoding genes

    DEFF Research Database (Denmark)

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi;

    2004-01-01

    . Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes-all classified in the CAZy system-have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics...

  9. Virginia Bioinformatics Institute names Otto Folkerts associate director of technology development

    OpenAIRE

    2005-01-01

    The Virginia Bioinformatics Institute (VBI) at Virginia Tech has named Otto Folkerts as the institute's associate director of technology development. In his new position, Folkerts will serve as a member of the senior management team with direct responsibility for the development of new business initiatives as well as expanding the institute's current commercial activities.

  10. A telescope for the RNA universe : novel bioinformatic approaches to analyze RNA sequencing data

    NARCIS (Netherlands)

    Pulyakhina, Irina

    2016-01-01

    In this thesis I focus on the application of bioinformatics to analyze RNA. The type of experimental data of interest is sequencing data generated with various Next Generation Sequencing technique: nuclear RNA, cytoplasmic RNA, captured polyadenylated RNA fragments, etc. I highlight the necessity in

  11. Genomic-bioinformatic analysis of transcripts enriched in the third-stage larva of the parasitic nematode Ascaris suum.

    Directory of Open Access Journals (Sweden)

    Cui-Qin Huang

    Full Text Available Differential transcription in Ascaris suum was investigated using a genomic-bioinformatic approach. A cDNA archive enriched for molecules in the infective third-stage larva (L3 of A. suum was constructed by suppressive-subtractive hybridization (SSH, and a subset of cDNAs from 3075 clones subjected to microarray analysis using cDNA probes derived from RNA from different developmental stages of A. suum. The cDNAs (n = 498 shown by microarray analysis to be enriched in the L3 were sequenced and subjected to bioinformatic analyses using a semi-automated pipeline (ESTExplorer. Using gene ontology (GO, 235 of these molecules were assigned to 'biological process' (n = 68, 'cellular component' (n = 50, or 'molecular function' (n = 117. Of the 91 clusters assembled, 56 molecules (61.5% had homologues/orthologues in the free-living nematodes Caenorhabditis elegans and C. briggsae and/or other organisms, whereas 35 (38.5% had no significant similarity to any sequences available in current gene databases. Transcripts encoding protein kinases, protein phosphatases (and their precursors, and enolases were abundantly represented in the L3 of A. suum, as were molecules involved in cellular processes, such as ubiquitination and proteasome function, gene transcription, protein-protein interactions, and function. In silico analyses inferred the C. elegans orthologues/homologues (n = 50 to be involved in apoptosis and insulin signaling (2%, ATP synthesis (2%, carbon metabolism (6%, fatty acid biosynthesis (2%, gap junction (2%, glucose metabolism (6%, or porphyrin metabolism (2%, although 34 (68% of them could not be mapped to a specific metabolic pathway. Small numbers of these 50 molecules were predicted to be secreted (10%, anchored (2%, and/or transmembrane (12% proteins. Functionally, 17 (34% of them were predicted to be associated with (non-wild-type RNAi phenotypes in C. elegans, the majority being embryonic lethality (Emb (13 types; 58.8%, larval arrest

  12. Sproglig Metode og Analyse

    DEFF Research Database (Denmark)

    le Fevre Jakobsen, Bjarne

    Publikationen indeholder øvematerialer, tekster, powerpointpræsentationer og handouts til undervisningsfaget Sproglig Metode og Analyse på BA og tilvalg i Dansk/Nordisk 2010-2011......Publikationen indeholder øvematerialer, tekster, powerpointpræsentationer og handouts til undervisningsfaget Sproglig Metode og Analyse på BA og tilvalg i Dansk/Nordisk 2010-2011...

  13. Functional Characterization of the Osteoarthritis Susceptibility Mapping to CHST11-A Bioinformatics and Molecular Study.

    Directory of Open Access Journals (Sweden)

    Louise N Reynard

    Full Text Available The single nucleotide polymorphism (SNP rs835487 is associated with hip osteoarthritis (OA at the genome-wide significance level and is located within CHST11, which codes for carbohydrate sulfotransferase 11. This enzyme post-translationally modifies proteoglycan prior to its deposition in the cartilage extracellular matrix. Using bioinformatics and experimental analyses, our aims were to characterise the rs835487 association signal and to identify the causal functional variant/s. Database searches revealed that rs835487 resides within a linkage disequilibrium (LD block of only 2.7 kb and is in LD (r2 ≥ 0.8 with six other SNPs. These are all located within intron 2 of CHST11, in a region that has predicted enhancer activity and which shows a high degree of conservation in primates. Luciferase reporter assays revealed that of the seven SNPs, rs835487 and rs835488, which have a pairwise r2 of 0.962, are the top functional candidates; the haplotype composed of the OA-risk conferring G allele of rs835487 and the corresponding T allele of rs835488 (the G-T haplotype demonstrated significantly different enhancer activity relative to the haplotype composed of the non-risk A allele of rs835487 and the corresponding C allele of rs835488 (the A-C haplotype (p < 0.001. Electrophoretic mobility shift assays and supershifts identified several transcription factors that bind more strongly to the risk-conferring G and T alleles of the two SNPs, including SP1, SP3, YY1 and SUB1. CHST11 was found to be upregulated in OA versus non-OA cartilage (p < 0.001 and was expressed dynamically during chondrogenesis. Its expression in adult cartilage did not however correlate with rs835487 genotype. Our data demonstrate that the OA susceptibility is mediated by differential protein binding to the alleles of rs835487 and rs835488, which are located within an enhancer whose target may be CHST11 during chondrogenesis or an alternative gene.

  14. Functional, genetic and bioinformatic characterization of a calcium/calmodulin kinase gene in Sporothrix schenckii

    Directory of Open Access Journals (Sweden)

    Rodriguez-del Valle Nuri

    2007-11-01

    Full Text Available Abstract Background Sporothrix schenckii is a pathogenic, dimorphic fungus, the etiological agent of sporotrichosis, a subcutaneous lymphatic mycosis. Dimorphism in S. schenckii responds to second messengers such as cAMP and calcium, suggesting the possible involvement of a calcium/calmodulin kinase in its regulation. In this study we describe a novel calcium/calmodulin-dependent protein kinase gene in S. schenckii, sscmk1, and the effects of inhibitors of calmodulin and calcium/calmodulin kinases on the yeast to mycelium transition and the yeast cell cycle. Results Using the PCR homology approach a new member of the calcium/calmodulin kinase family, SSCMK1, was identified in this fungus. The cDNA sequence of sscmk1 revealed an open reading frame of 1,221 nucleotides encoding a 407 amino acid protein with a predicted molecular weight of 45.6 kDa. The genomic sequence of sscmk1 revealed the same ORF interrupted by five introns. Bioinformatic analyses of SSCMK1 showed that this protein had the distinctive features that characterize a calcium/calmodulin protein kinase: a serine/threonine protein kinase domain and a calmodulin-binding domain. When compared to homologues from seven species of filamentous fungi, SSCMK1 showed substantial similarities, except for a large and highly variable region that encompasses positions 330 – 380 of the multiple sequence alignment. Inhibition studies using calmodulin inhibitor W-7, and calcium/calmodulin kinase inhibitors, KN-62 and lavendustin C, were found to inhibit budding by cells induced to re-enter the yeast cell cycle and to favor the yeast to mycelium transition. Conclusion This study constitutes the first evidence of the presence of a calcium/calmodulin kinase-encoding gene in S. schenckii and its possible involvement as an effector of dimorphism in this fungus. These results suggest that a calcium/calmodulin dependent signaling pathway could be involved in the regulation of dimorphism in this fungus

  15. Bioinformatic Analysis of Differential Protein Expression in Calu-3 Cells Exposed to Carbon Nanotubes

    Directory of Open Access Journals (Sweden)

    Pin Li

    2013-10-01

    Full Text Available Carbon nanomaterials are widely produced and used in industry, medicine and scientific research. To examine the impact of exposure to nanoparticles on human health, the human airway epithelial cell line, Calu-3, was used to evaluate changes in the cellular proteome that could account for alterations in cellular function of airway epithelia after 24 hexposure to 10 μg/mL and 100 ng/mLof two common carbon nanoparticles, single- and multi-wall carbon nanotubes (SWCNT, MWCNT. After exposure to the nanoparticles, label-free quantitative mass spectrometry (LFQMS was used to study the differential protein expression. Ingenuity Pathway Analysis (IPA was used to conduct a bioinformaticanalysis of proteins identified in LFQMS. Interestingly, after exposure to ahigh concentration (10 mg/mL; 0.4 mg/cm2 of MWCNT or SWCNT, only 8 and 13 proteins, respectively, exhibited changes in abundance. In contrast, the abundance of hundreds of proteins was altered in response to a low concentration (100 ng/mL; 4 ng/cm2 of either CNT. Of the 281 and 282 proteins that were significantly altered in response to MWCNT or SWCNT respectively, 231 proteins were the same. Bioinformatic analyses found that the proteins in common to both nanotubes occurred within the cellular functions of cell death and survival, cell-to-cell signaling and interaction, cellular assembly and organization, cellular growth and proliferation, infectious disease, molecular transport and protein synthesis. The majority of the protein changes represent a decrease in amount suggesting a general stress response to protect cells. The STRING database was used to analyze the various functional protein networks. Interestingly, some proteins like cadherin 1 (CDH1, signal transducer and activator of transcription 1 (STAT1, junction plakoglobin (JUP, and apoptosis-associated speck-like protein containing a CARD (PYCARD, appear in several functional categories and tend to be in the center of the networks. This

  16. Bioinformatics describes novel Loci for high resolution discrimination of leptospira isolates.

    Directory of Open Access Journals (Sweden)

    Gustavo M Cerqueira

    Full Text Available BACKGROUND: Leptospirosis is one of the most widespread zoonoses in the world and with over 260 pathogenic serovars there is an urgent need for a molecular system of classification. The development of multilocus sequence typing (MLST schemes for Leptospira spp. is addressing this issue. The aim of this study was to identify loci with potential to enhance Leptospira strain discrimination by sequencing-based methods. METHODOLOGY AND PRINCIPAL FINDINGS: We used bioinformatics to evaluate pre-existing loci with the potential to increase the discrimination of outbreak strains. Previously deposited sequence data were evaluated by phylogenetic analyses using either single or concatenated sequences. We identified and evaluated the applicability of the ligB, secY, rpoB and lipL41 loci, individually and in combination, to discriminate between 38 pathogenic Leptospira strains and to cluster them according to the species they belonged to. Pairwise identity among the loci ranged from 82.0-92.0%, while interspecies identity was 97.7-98.5%. Using the ligB-secY-rpoB-lipL41 superlocus it was possible to discriminate 34/38 strains, which belong to six pathogenic Leptospira species. In addition, the sequences were concatenated with the superloci from 16 sequence types from a previous MLST scheme employed to study the association of a leptospiral clone with an outbreak of human leptospirosis in Thailand. Their use enhanced the discriminative power of the existing scheme. The lipL41 and rpoB loci raised the resolution from 81.0-100%, but the enhanced scheme still remains limited to the L. interrogans and L. kirschneri species. CONCLUSIONS: As the first aim of our study, the ligB-secY-rpoB-lipL41 superlocus demonstrated a satisfactory level of discrimination among the strains evaluated. Second, the inclusion of the rpoB and lipL41 loci to a MLST scheme provided high resolution for discrimination of strains within L. interrogans and L. kirschneri and might be useful

  17. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV, HAdV–C2 and HAdV–C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910 within the hexon gene, of which HAdV–C2 and HAdV–C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  18. Bioinformatics analysis and genetic diversity of the poliovirus.

    Science.gov (United States)

    Liu, Yanhan; Ma, Tengfei; Liu, Jianzhu; Zhao, Xiaona; Cheng, Ziqiang; Guo, Huijun; Wang, Shujing; Xu, Ruixue

    2014-12-01

    Poliomyelitis, a disease which can manifest as muscle paralysis, is caused by the poliovirus, which is a human enterovirus and member of the family Picornaviridae that usually transmits by the faecal-oral route. The viruses of the OPV (oral poliovirus attenuated-live vaccine) strains can mutate in the human intestine during replication and some of these mutations can lead to the recovery of serious neurovirulence. Informatics research of the poliovirus genome can be used to explain further the characteristics of this virus. In this study, sequences from 100 poliovirus isolates were acquired from GenBank. To determine the evolutionary relationship between the strains, we compared and analysed the sequences of the complete poliovirus genome and the VP1 region. The reconstructed phylogenetic trees for the complete sequences and the VP1 sequences were both divided into two branches, indicating that the genetic relationships of the whole poliovirus genome and the VP1 sequences are very similar. This branching indicates that the virulence and pathogenicity of poliomyelitis may be associated with the VP1 region. Sequence alignment of the VP1 region revealed numerous mutation sites in which mutation rates of >30 % were detected. In a group of strains recorded in the USA, mutation sites and mutation types were the same and this may be associated with their distribution in the evolutionary tree and their genetic relationship. In conclusion, the genetic evolutionary relationships of poliovirus isolate sequences are determined to a great extent by the VP1 protein, and poliovirus strains located on the same branch of the phylogenetic tree contain the same mutation spots and mutation types. Hence, the genetic characteristics of the VP1 region in the poliovirus genome should be analysed to identify the transmission route of poliovirus and provide the basis of viral immunity development.

  19. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety.

  20. The Development of Protein Microarrays and Their Applications in DNA-Protein and Protein-Protein Interaction Analyses of Arabidopsis Transcription Factors

    Institute of Scientific and Technical Information of China (English)

    Wei Gong; Kun He; Mike Covington; S.R Dinesh-Kumar; Michael Snyder; Stacey L.Harmer; Yu-Xian Zhu; Xing Wang Deng

    2008-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to constructprotein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and proteinprotein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale.

  1. The development of protein microarrays and their applications in DNA-protein and protein-protein interaction analyses of Arabidopsis transcription factors.

    Science.gov (United States)

    Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S P; Snyder, Michael; Harmer, Stacey L; Zhu, Yu-Xian; Deng, Xing Wang

    2008-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale.

  2. SeqBuster, a bioinformatic tool for the processing and analysis of small RNAs datasets, reveals ubiquitous miRNA modifications in human embryonic cells.

    Science.gov (United States)

    Pantano, Lorena; Estivill, Xavier; Martí, Eulàlia

    2010-03-01

    High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The high flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits the annotation against any custom database. SeqBuster integrates multiple analyses modules in a unique platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small-RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated to stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA-variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivill_lab.crg.es/seqbuster.

  3. Evolution of proline biosynthesis: enzymology, bioinformatics, genetics, and transcriptional regulation.

    Science.gov (United States)

    Fichman, Yosef; Gerdes, Svetlana Y; Kovács, Hajnalka; Szabados, László; Zilberstein, Aviah; Csonka, Laszlo N

    2015-11-01

    Proline is not only an essential component of proteins but it also has important roles in adaptation to osmotic and dehydration stresses, redox control, and apoptosis. Here, we review pathways of proline biosynthesis in the three domains of life. Pathway reconstruction from genome data for hundreds of eubacterial and dozens of archaeal and eukaryotic organisms revealed evolutionary conservation and variations of this pathway across different taxa. In the most prevalent pathway of proline synthesis, glutamate is phosphorylated to γ-glutamyl phosphate by γ-glutamyl kinase, reduced to γ-glutamyl semialdehyde by γ-glutamyl phosphate reductase, cyclized spontaneously to Δ(1)-pyrroline-5-carboxylate and reduced to proline by Δ(1)-pyrroline-5-carboxylate reductase. In higher plants and animals the first two steps are catalysed by a bi-functional Δ(1) -pyrroline-5-carboxylate synthase. Alternative pathways of proline formation use the initial steps of the arginine biosynthetic pathway to ornithine, which can be converted to Δ(1)-pyrroline-5-carboxylate by ornithine aminotransferase and then reduced to proline or converted directly to proline by ornithine cyclodeaminase. In some organisms, the latter pathways contribute to or could be fully responsible for the synthesis of proline. The conservation of proline biosynthetic enzymes and significance of specific residues for catalytic activity and allosteric regulation are analysed on the basis of protein structural data, multiple sequence alignments, and mutant studies, providing novel insights into proline biosynthesis in organisms. We also discuss the transcriptional control of the proline biosynthetic genes in bacteria and plants.

  4. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    Science.gov (United States)

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  5. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    Science.gov (United States)

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources.

  6. Bioinformatics Analysis of the FREM1 Gene—Evolutionary Development of the IL-1R1 Co-Receptor, TILRR

    Directory of Open Access Journals (Sweden)

    Eva E. Qwarnstrom

    2012-09-01

    Full Text Available The TLRs and IL-1 receptors have evolved to coordinate the innate immune response following pathogen invasion. Receptors and signalling intermediates of these systems are generally characterised by a high level of evolutionary conservation. The recently described IL-1R1 co-receptor TILRR is a transcriptional variant of the FREM1 gene. Here we investigate whether innate co-receptor differences between teleosts and mammals extend to the expression of the TILRR isoform of FREM1. Bioinformatic and phylogenetic approaches were used to analyse the genome sequences of FREM1 from eukaryotic organisms including 37 tetrapods and five teleost fish. The TILRR consensus peptide sequence was present in the FREM1 gene of the tetrapods, but not in fish orthologs of FREM1, and neither FREM1 nor TILRR were present in invertebrates. The TILRR gene appears to have arisen via incorporation of adjacent non-coding DNA with a contiguous exonic sequence after the teleost divergence. Comparing co-receptors in other systems, points to their origin during the same stages of evolution. Our results show that modern teleost fish do not possess the IL-1RI co-receptor TILRR, but that this is maintained in tetrapods as early as amphibians. Further, they are consistent with data showing that co-receptors are recent additions to these regulatory systems and suggest this may underlie differences in innate immune responses between mammals and fish.

  7. Molecular ecological network analyses

    Directory of Open Access Journals (Sweden)

    Deng Ye

    2012-05-01

    Full Text Available Abstract Background Understanding the interaction among different species within a community and their responses to environmental changes is a central goal in ecology. However, defining the network structure in a microbial community is very challenging due to their extremely high diversity and as-yet uncultivated status. Although recent advance of metagenomic technologies, such as high throughout sequencing and functional gene arrays, provide revolutionary tools for analyzing microbial community structure, it is still difficult to examine network interactions in a microbial community based on high-throughput metagenomics data. Results Here, we describe a novel mathematical and bioinformatics framework to construct ecological association networks named molecular ecological networks (MENs through Random Matrix Theory (RMT-based methods. Compared to other network construction methods, this approach is remarkable in that the network is automatically defined and robust to noise, thus providing excellent solutions to several common issues associated with high-throughput metagenomics data. We applied it to determine the network structure of microbial communities subjected to long-term experimental warming based on pyrosequencing data of 16 S rRNA genes. We showed that the constructed MENs under both warming and unwarming conditions exhibited topological features of scale free, small world and modularity, which were consistent with previously described molecular ecological networks. Eigengene analysis indicated that the eigengenes represented the module profiles relatively well. In consistency with many other studies, several major environmental traits including temperature and soil pH were found to be important in determining network interactions in the microbial communities examined. To facilitate its application by the scientific community, all these methods and statistical tools have been integrated into a comprehensive Molecular Ecological

  8. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    Science.gov (United States)

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use nickel metalloenzyme urease to hydrolyze urea, which is one of the commonly used nitrogen fertilizer in plant growth thus its enzymatic hydrolysis possesses vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species such as Glycine max (soybean) and Medicago truncatula (barrel medic) from Fabaceae family. With additional plant species, primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties with 833-878 amino acid residues and 89.39-90.91 kDa molecular weight with mainly acidic (5.15-6.10 pI) nature. Four protein domain structures such as urease gamma, urease beta, urease alpha and amidohydro 1 characterized the plant ureases. Secondary structure of plant ureases also demonstrated conserved protein architecture, with predominantly α-helix and random coil structures. In

  9. Proteomic analyses of apoplastic proteins from germinating Arabidopsis thaliana pollen.

    Science.gov (United States)

    Ge, Weina; Song, Yun; Zhang, Cuijun; Zhang, Yafang; Burlingame, Alma L; Guo, Yi

    2011-12-01

    Pollen grains play important roles in the reproductive processes of flowering plants. The roles of apoplastic proteins in pollen germination and in pollen tube growth are comparatively less well understood. To investigate the functions of apoplastic proteins in pollen germination, the global apoplastic proteins of mature and germinated Arabidopsis thaliana pollen grains were prepared for differential analyses by using 2-dimensional fluorescence difference gel electrophoresis (2-D DIGE) saturation labeling techniques. One hundred and three proteins differentially expressed (p value≤0.01) in pollen germinated for 6h compared with un-germination mature pollen, and 98 spots, which represented 71 proteins, were identified by LC-MS/MS. By bioinformatics analysis, 50 proteins were identified as secreted proteins. These proteins were mainly involved in cell wall modification and remodeling, protein metabolism and signal transduction. Three of the differentially expressed proteins were randomly selected to determine their subcellular localizations by transiently expressing YFP fusion proteins. The results of subcellular localization were identical with the bioinformatics prediction. Based on these data, we proposed a model for apoplastic proteins functioning in pollen germination and pollen tube growth. These results will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth.

  10. Genomics, Proteomics & Bioinformatics (GPB) Has a New Start——Open Access

    Institute of Scientific and Technical Information of China (English)

    Jun Yu

    2012-01-01

    We are now presenting to our readers the first issue of Volume 10 of Genomics,Proteomics & Bioinformatics (GPB).It is also the first issue for the new status.GPB was founded in 2003 and published in English,focusing on research advancement in the fields of omics and bioinformatics.To ensure an international presence,GPB has its 30-50% editorial board members from outside China.From 2006,GPB has been co-published by Elsevier and Science Press,and its full-text articles are available for downloading from ScienceDirect.In 2011,GPB became a bimonthly journal.Annual downloading counts keep increasing with around 70% from outside China in 2011 (Figure 1).In addition,submissions from abroad account for 70% of the published articles after collaborating with Elsevier.

  11. Characterization of epitopes recognized by monoclonal antibodies: experimental approaches supported by freely accessible bioinformatic tools.

    Science.gov (United States)

    Clementi, Nicola; Mancini, Nicasio; Castelli, Matteo; Clementi, Massimo; Burioni, Roberto

    2013-05-01

    Monoclonal antibodies (mAbs) have been used successfully both in research and for clinical purposes. The possible use of protective mAbs directed against different microbial pathogens is currently being considered. The fine definition of the epitope recognized by a protective mAb is an important aspect to be considered for possible development in epitope-based vaccinology. The most accurate approach to this is the X-ray resolution of mAb/antigen crystal complex. Unfortunately, this approach is not always feasible. Under this perspective, several surrogate epitope mapping strategies based on the use of bioinformatics have been developed. In this article, we review the most common, freely accessible, bioinformatic tools used for epitope characterization and provide some basic examples of molecular visualization, editing and computational analysis.

  12. A middleware-based platform for the integration of bioinformatic services

    Directory of Open Access Journals (Sweden)

    Guzmán Llambías

    2015-08-01

    Full Text Available Performing Bioinformatic´s experiments involve an intensive access to distributed services and information resources through Internet. Although existing tools facilitate the implementation of workflow-oriented applications, they lack of capabilities to integrate services beyond low-scale applications, particularly integrating services with heterogeneous interaction patterns and in a larger scale. This is particularly required to enable a large-scale distributed processing of biological data generated by massive sequencing technologies. On the other hand, such integration mechanisms are provided by middleware products like Enterprise Service Buses (ESB, which enable to integrate distributed systems following a Service Oriented Architecture. This paper proposes an integration platform, based on enterprise middleware, to integrate Bioinformatics services. It presents a multi-level reference architecture and focuses on ESB-based mechanisms to provide asynchronous communications, event-based interactions and data transformation capabilities. The paper presents a formal specification of the platform using the Event-B model.

  13. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology.

    Science.gov (United States)

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-11-01

    Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, a high level modern research is currently being applied both at the molecular level as well as at the field level. The computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visearal leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another.

  14. Bioinformatics tools and databases for whole genome sequence analysis of Mycobacterium tuberculosis.

    Science.gov (United States)

    Faksri, Kiatichai; Tan, Jun Hao; Chaiprasert, Angkana; Teo, Yik-Ying; Ong, Rick Twee-Hee

    2016-11-01

    Tuberculosis (TB) is an infectious disease of global public health importance caused by Mycobacterium tuberculosis complex (MTC) in which M. tuberculosis (Mtb) is the major causative agent. Recent advancements in genomic technologies such as next generation sequencing have enabled high throughput cost-effective generation of whole genome sequence information from Mtb clinical isolates, providing new insights into the evolution, genomic diversity and transmission of the Mtb bacteria, including molecular mechanisms of antibiotic resistance. The large volume of sequencing data generated however necessitated effective and efficient management, storage, analysis and visualization of the data and results through development of novel and customized bioinformatics software tools and databases. In this review, we aim to provide a comprehensive survey of the current freely available bioinformatics software tools and publicly accessible databases for genomic analysis of Mtb for identifying disease transmission in molecular epidemiology and in rapid determination of the antibiotic profiles of clinical isolates for prompt and optimal patient treatment.

  15. 生物信息学与药物设计%Bioinformatics and drug design

    Institute of Scientific and Technical Information of China (English)

    胡俊; 梁学友; 杨建红

    2014-01-01

    As an interdisciplinary course with integrated theoretical methods of biology, mathematics, physics, information science and computer science, bioinformatics provides new approaches for drug design. The articles reviewed application of bioinformatics in drug design with reference to the current status and development of drug design.%生物信息学是综合运用生物学、数学、物理学、信息科学以及计算机科学等学科的理论方法而形成的交叉学科,它为药物设计提供了新的方法。本文结合药物设计的现状与发展对生物信息学在药物设计中的应用进行综述。

  16. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable.. The analysis of the datasets of Next Generation Sequencing needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Also Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the  6th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, 28-30th March, 2012 which brought together interdisciplinary scientists that have a strong background in the biological and computational sciences.

  17. An integrated approach utilizing proteomics and bioinformatics to detect ovarian cancer

    Institute of Scientific and Technical Information of China (English)

    YU Jie-kai; ZHENG Shu; TANG Yong; LI Li

    2005-01-01

    Objective: To find new potential biomarkers and establish the patterns for the detection of ovarian cancer. Methods:Sixty one serum samples including 32 ovarian cancer patients and 29 healthy people were detected by surface-enhanced laser desorption/ionization mass spectrometry (SELDI-MS). The protein fingerprint data were analyzed by bioinformatics tools. Ten folds cross-validation support vector machine (SVM) was used to establish the diagnostic pattern. Results: Five potential biomarkers were found (2085 Da, 5881 Da, 7564 Da, 9422 Da, 6044 Da), combined with which the diagnostic pattern separated the ovarian cancer from the healthy samples with a sensitivity of 96.7%, a specificity of 96.7% and a positive predictive value of96.7%. Conclusions: The combination of SELDI with bioinformatics tools could find new biomarkers and establish patterns with high sensitivity and specificity for the detection of ovarian cancer.

  18. Biologically inspired intelligent decision making: a commentary on the use of artificial neural networks in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2014-01-01

    Artificial neural networks (ANNs) are a class of powerful machine learning models for classification and function approximation which have analogs in nature. An ANN learns to map stimuli to responses through repeated evaluation of exemplars of the mapping. This learning approach results in networks which are recognized for their noise tolerance and ability to generalize meaningful responses for novel stimuli. It is these properties of ANNs which make them appealing for applications to bioinformatics problems where interpretation of data may not always be obvious, and where the domain knowledge required for deductive techniques is incomplete or can cause a combinatorial explosion of rules. In this paper, we provide an introduction to artificial neural network theory and review some interesting recent applications to bioinformatics problems.

  19. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data

    Science.gov (United States)

    Roumpeka, Despoina D.; Wallace, R. John; Escalettes, Frank; Fotheringham, Ian; Watson, Mick

    2017-01-01

    The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such “bioprospecting” requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes. PMID:28321234

  20. Laser Beam Focus Analyser

    DEFF Research Database (Denmark)

    Nielsen, Peter Carøe; Hansen, Hans Nørgaard; Olsen, Flemming Ove

    2007-01-01

    The quantitative and qualitative description of laser beam characteristics is important for process implementation and optimisation. In particular, a need for quantitative characterisation of beam diameter was identified when using fibre lasers for micro manufacturing. Here the beam diameter limits...... the obtainable features in direct laser machining as well as heat affected zones in welding processes. This paper describes the development of a measuring unit capable of analysing beam shape and diameter of lasers to be used in manufacturing processes. The analyser is based on the principle of a rotating...... mechanical wire being swept through the laser beam at varying Z-heights. The reflected signal is analysed and the resulting beam profile determined. The development comprised the design of a flexible fixture capable of providing both rotation and Z-axis movement, control software including data capture...

  1. Bioinformatic Analysis of Putative Gene Products Encoded in SARS-HCoV Genome

    Institute of Scientific and Technical Information of China (English)

    赵心刚; 韩敬东; 宁元亨; 孟安明; 陈晔光

    2003-01-01

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus named as SARS-HCoV.Using bioinformatic methods, we have performed a detailed domain search.In addition to the viral structure proteins, we have found that several putative polypeptides share sequence similarity to known domains or proteins.This study may provide a basis for future studies on the infection and replication process of this notorious virus.

  2. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols

    OpenAIRE

    Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A.

    2015-01-01

    Background Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it ...

  3. The web server of IBM's Bioinformatics and Pattern Discovery group: 2004 update

    OpenAIRE

    Huynh, Tien; Rigoutsos, Isidore

    2004-01-01

    In this report, we provide an update on the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server, which is operational around the clock, provides access to a large number of methods that have been developed and published by the group's members. There is an increasing number of problems that these tools can help tackle; these problems range from the discovery of patterns in streams of events and the computation of multiple s...

  4. The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins.

    Science.gov (United States)

    Lawton, Thomas J; Kenney, Grace E; Hurley, Joseph D; Rosenzweig, Amy C

    2016-04-19

    The CopC proteins are periplasmic copper binding proteins believed to play a role in bacterial copper homeostasis. Previous studies have focused on CopCs that are part of seven-protein Cop or Pco systems involved in copper resistance. These canonical CopCs contain distinct Cu(I) and Cu(II) binding sites. Mounting evidence suggests that CopCs are more widely distributed, often present only with the CopD inner membrane protein, frequently as a fusion protein, and that the CopC and CopD proteins together function in the uptake of copper to the cytoplasm. In the methanotroph Methylosinus trichosporium OB3b, genes encoding a CopCD pair are located adjacent to the particulate methane monooxygenase (pMMO) operon. The CopC from this organism (Mst-CopC) was expressed, purified, and structurally characterized. The 1.46 Å resolution crystal structure of Mst-CopC reveals a single Cu(II) binding site with coordination somewhat different from that in canonical CopCs, and the absence of a Cu(I) binding site. Extensive bioinformatic analyses indicate that the majority of CopCs in fact contain only a Cu(II) site, with just 10% of sequences corresponding to the canonical two-site CopC. Accordingly, a new classification scheme for CopCs was developed, and detailed analyses of the sequences and their genomic neighborhoods reveal new proteins potentially involved in copper homeostasis, providing a framework for expanded models of CopCD function.

  5. How might ZNF804A variants influence risk for schizophrenia and bipolar disorder? A literature review, synthesis, and bioinformatic analysis.

    Science.gov (United States)

    Hess, Jonathan L; Glatt, Stephen J

    2014-01-01

    The gene that encodes zinc finger protein 804A (ZNF804A) became a candidate risk gene for schizophrenia (SZ) after surpassing genome-wide significance thresholds in replicated genome-wide association scans and meta-analyses. Much remains unknown about this reported gene expression regulator; however, preliminary work has yielded insights into functional and biological effects of ZNF804A by targeting its regulatory activities in vitro and by characterizing allele-specific interactions with its risk-conferring single nucleotide polymorphisms (SNPs). There is now strong epidemiologic evidence for a role of ZNF804A polymorphisms in both SZ and bipolar disorder (BD); however, functional links between implicated variants and susceptible biological states have not been solidified. Here we briefly review the genetic evidence implicating ZNF804A polymorphisms as genetic risk factors for both SZ and BD, and discuss the potential functional consequences of these variants on the regulation of ZNF804A and its downstream targets. Empirical work and predictive bioinformatic analyses of the alternate alleles of the two most strongly implicated ZNF804A polymorphisms suggest they might alter the affinity of the gene sequence for DNA- and/or RNA-binding proteins, which might in turn alter expression levels of the gene or particular ZNF804A isoforms. Future work should focus on clarifying the critical periods and cofactors regulating these genetic influences on ZNF804A expression, as well as the downstream biological consequences of an imbalance in the expression of ZNF804A and its various mRNA isoforms.

  6. Possible Biomarkers for the Early Detection of HIV-associated Heart Diseases: A Proteomics and Bioinformatics Prediction

    Directory of Open Access Journals (Sweden)

    Suraiya Rasheed

    2015-01-01

    Full Text Available The frequency of cardiovascular disorders is increasing in HIV-infected individuals despite a significant reduction in the viral load by antiretroviral therapies (ART. Since the CD4+ T-cells are responsible for the viral load as well as immunological responses, we hypothesized that chronic HIV-infection of T-cells produces novel proteins/enzymes that cause cardiac dysfunctions. To identify specific factors that might cause cardiac disorders without the influence of numerous cofactors produced by other pathogenic microorganisms that co-inhabit most HIV-infected individuals, we analyzed genome-wide proteomes of a CD4+ T-cell line at different stages of HIV replication and cell growth over >6 months. Subtractive analyses of several hundred differentially regulated proteins from HIV-infected and uninfected counterpart cells and comparisons with proteins expressed from the same cells after treating with the antiviral drug Zidovudine/AZT and inhibiting virus replication, identified a well-coordinated network of 12 soluble/diffusible proteins in HIV-infected cells. Functional categorization, bioinformatics and statistical analyses of each protein predicted that the expression of cardiac-specific Ca2+ kinase together with multiple Ca2+ release channels causes a sustained overload of Ca2+ in the heart which induces fetal/cardiac myosin heavy chains (MYH6 and MYH7 and a myosin light-chain kinase. Each of these proteins has been shown to cause cardiac stress, arrhythmia, hypertrophic signaling, cardiomyopathy and heart failure (p = 8 × 10−11. Translational studies using the newly discovered proteins produced by HIV infection alone would provide additional biomarkers that could be added to the conventional markers for an early diagnosis and/or development of specific therapeutic interventions for heart diseases in HIV-infected individuals.

  7. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.

    Science.gov (United States)

    de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph

    2013-03-20

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios.For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature.We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience.

  8. Interoperability of GADU in using heterogeneous grid resources for bioinformatics applications.

    Science.gov (United States)

    Sulakhe, Dinanath; Rodriguez, Alex; Wilde, Michael; Foster, Ian; Maltsev, Natalia

    2008-03-01

    Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.

  9. The Web as an educational tool for/in learning/teaching bioinformatics statistics.

    Science.gov (United States)

    Oliver, J; Pisano, M E; Alonso, T; Roca, P

    2005-12-01

    Statistics provides essential tool in Bioinformatics to interpret the results of a database search or for the management of enormous amounts of information provided from genomics, proteomics and metabolomics. The goal of this project was the development of a software tool that would be as simple as possible to demonstrate the use of the Bioinformatics statistics. Computer Simulation Methods (CSMs) developed using Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing and easy graphics representation, and of general use and acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When tested on students with previous coursework with traditional statistical teaching methods, the general opinion/overall consensus was that Web-based instruction had numerous advantages, but traditional methods with manual calculations were also needed for their theory and practice. Once having mastered the basic statistical formulas, Excel spreadsheets and graphics were shown to be very useful for trying many parameters in a rapid fashion without having to perform tedious calculations. CSMs will be of great importance for the formation of the students and professionals in the field of bioinformatics, and for upcoming applications of self-learning and continuous formation.

  10. Cloning and Bioinformatics Analysis of ZmERECTA-LIKE1 and Construction of Plant Expression Vector

    Institute of Scientific and Technical Information of China (English)

    Yihong JI; Jinbao PAN; Min LU; Jun HAN; Zhangjie NAN; Qingpeng SUN

    2016-01-01

    Objective] This study was conducted to clone and analyze ERECTA-LIKE1 gene in Zea mays by PCR and bioinfor-matics methods and to construct plant expression vector pCambia3301-zmERECTA-LIKE1. [Method] zmERECTA-LIKE1 (zmERL1) gene was obtained using RT-PCR, and physical-chemical properties were analyzed by bioinformatics methods, including domains, transmembrane regions, N-Glycosylation potential sites phosphorylation sites, and etc. [Result] Bioinformatics results showed that zmERL1 gene was 2 169 bp, which encoded a protein consisting of 722 amino acids, 11 N-glycosylation potential sites and 42 kinase specific phosphorylation sites. According to CDD2.23 and TMHMM Server v. 2.0 software, there were leucine-rich repeats, a PKC domain and a transmembrane region in this protein. The theoretical pI and molecular weight of zmERL1 encoded protein was 6.20 and 79 184.8 using Compute PI/Mw tool. Furthermore, we constructed the plant expression vector pCambia3301-zmERECTA-LIKE1 by subcloning zmERL1 gene into pCambia3301 instead of GUS. [Conclusion] The results provide a theoretical basis for the application of zmERL1 gene in future study.

  11. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Full Text Available Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR based tests for pathogenic E. coli strains. A variety of computer tools are used including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, thus are appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  12. Robust High-dimensional Bioinformatics Data Streams Mining by ODR-ioVFDT

    Science.gov (United States)

    Wang, Dantong; Fong, Simon; Wong, Raymond K.; Mohammed, Sabah; Fiaidhi, Jinan; Wong, Kelvin K. L.

    2017-01-01

    Outlier detection in bioinformatics data streaming mining has received significant attention by research communities in recent years. The problems of how to distinguish noise from an exception and deciding whether to discard it or to devise an extra decision path for accommodating it are causing dilemma. In this paper, we propose a novel algorithm called ODR with incrementally Optimized Very Fast Decision Tree (ODR-ioVFDT) for taking care of outliers in the progress of continuous data learning. By using an adaptive interquartile-range based identification method, a tolerance threshold is set. It is then used to judge if a data of exceptional value should be included for training or otherwise. This is different from the traditional outlier detection/removal approaches which are two separate steps in processing through the data. The proposed algorithm is tested using datasets of five bioinformatics scenarios and comparing the performance of our model and other ones without ODR. The results show that ODR-ioVFDT has better performance in classification accuracy, kappa statistics, and time consumption. The ODR-ioVFDT applied onto bioinformatics streaming data processing for detecting and quantifying the information of life phenomena, states, characters, variables and components of the organism can help to diagnose and treat disease more effectively. PMID:28230161

  13. The Expression and Bioinformatic Analysis of a Novel Gene C20orf14 Associated with Lymphoma

    Institute of Scientific and Technical Information of China (English)

    Liangping SU; Deng CHEN; Jianming ZHANG; Ximing LI; Guihong PAN; Xiangyang BAI; Yunping LU; Jianfeng ZHOU; Shuang LI

    2008-01-01

    The aim of the present study was to explore the differentially expressed genes in the blood vessel endothelial cells (BVECs) between diffuse large B-cell lymphoma (DLBCL) and reac- tive lymph node hyperplasia (RLNH), and to perform an initial bioinformatics analysis on a novel gene, C20orf14, which is highly expressed in lymph node of lymphoma. The mRNA of the tissue from the BVECs of DLBCL and RLNH tissues was labeled with biotin respectively and hybridized with expression profile microarray, and the differentially expressed genes were obtained. Initial bio- informatics analysis was performed on a novel gene named C20orf14. Its gene structure, genomic lo- calization, the physical and chemical characteristics of the putative protein, subcellular localization, functional domain etc. were predicted, and the systematic evolution analysis was performed on the similar proteins among several species. By using expression profile microarray, many differentially expressed genes were uncovered. The efficient bioinformatics analysis have fundamentally identified that C20orfl4 was a nuclear protein, and may be involved in the post-transcription modification of mRNA. Therefore, microarray is an efficient and high throughout strategy for the detection of differ- entially expressed genes, and C20orf14 is thought to be a potential target for tumor metastasis re- searches by bioinformatics analysis.

  14. Controlling new knowledge: Genomic science, governance and the politics of bioinformatics.

    Science.gov (United States)

    Salter, Brian; Salter, Charlotte

    2017-01-01

    The rise of bioinformatics is a direct response to the political difficulties faced by genomics in its quest to be a new biomedical innovation, and the value of bioinformatics lies in its role as the bridge between the promise of genomics and its realization in the form of health benefits. Western scientific elites are able to use their close relationship with the state to control and facilitate the emergence of new domains compatible with the existing distribution of epistemic power - all within the embrace of public trust. The incorporation of bioinformatics as the saviour of genomics had to be integrated with the operation of two key aspects of governance in this field: the definition and ownership of the new knowledge. This was achieved mainly by the development of common standards and by the promotion of the values of communality, open access and the public ownership of data to legitimize and maintain the governance power of publicly funded genomic science. Opposition from industry advocating the private ownership of knowledge has been largely neutered through the institutions supporting the science-state concordat. However, in order for translation into health benefits to occur and public trust to be assured, genomic and clinical data have to be integrated and knowledge ownership agreed upon across the separate and distinct governance territories of scientist, clinical medicine and society. Tensions abound as science seeks ways of maintaining its control of knowledge production through the negotiation of new forms of governance with the institutions and values of clinicians and patients.

  15. Advances in G-protein coupled receptor research and related bioinformatics study

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    G-protein coupled receptor (GPCR) is one of the most important protein families for drug target. GPCR agonists and antagonists occupy approximately one third of the world small molecule drug market. Much effort has been invested in GPCR study by both academic institutions and pharmaceutical industries. With seven-transmembrane domains, GPCR plays significant roles in intercellular signal transduction and is involved in a variety of biological pathways. With the availability of sequence data of human and other mammalian genomes, as well as their expressed sequence tag (EST) data, the bioinformatics and genomics approaches can be applied to identifying novel GPCR in the post genomic era. Deorphanizing GPCR or matching ligands with GPCR greatly facilitates target validation process and automatically provides a possible compound screening assay. Similarly, bioinformatics data mining approach could also be applied to the identification of GPCR peptide or protein ligands. Here we give a general review of recent advances in the study of GPCR structure, function, as well as GPCR and ligand identification with the emphasis on the bioinformatics database mining of GPCR and their peptide or protein ligands.

  16. Recent developments in genomics, bioinformatics and drug discovery to combat emerging drug-resistant tuberculosis.

    Science.gov (United States)

    Swaminathan, Soumya; Sundaramurthi, Jagadish Chandrabose; Palaniappan, Alangudi Natarajan; Narayanan, Sujatha

    2016-12-01

    Emergence of drug-resistant tuberculosis (DR-TB) is a big challenge in TB control. The delay in diagnosis of DR-TB leads to its increased transmission, and therefore prevalence. Recent developments in genomics have enabled whole genome sequencing (WGS) of Mycobacterium tuberculosis (M. tuberculosis) from 3-day-old liquid culture and directly from uncultured sputa, while new bioinformatics tools facilitate to determine DR mutations rapidly from the resulting sequences. The present drug discovery and development pipeline is filled with candidate drugs which have shown efficacy against DR-TB. Furthermore, some of the FDA-approved drugs are being evaluated for repurposing, and this approach appears promising as several drugs are reported to enhance efficacy of the standard TB drugs, reduce drug tolerance, or modulate the host immune response to control the growth of intracellular M. tuberculosis. Recent developments in genomics and bioinformatics along with new drug discovery collectively have the potential to result in synergistic impact leading to the development of a rapid protocol to determine the drug resistance profile of the infecting strain so as to provide personalized medicine. Hence, in this review, we discuss recent developments in WGS, bioinformatics and drug discovery to perceive how they would transform the management of tuberculosis in a timely manner.

  17. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need...... access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools...... into their research.We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  18. Wavelet Analyses and Applications

    Science.gov (United States)

    Bordeianu, Cristian C.; Landau, Rubin H.; Paez, Manuel J.

    2009-01-01

    It is shown how a modern extension of Fourier analysis known as wavelet analysis is applied to signals containing multiscale information. First, a continuous wavelet transform is used to analyse the spectrum of a nonstationary signal (one whose form changes in time). The spectral analysis of such a signal gives the strength of the signal in each…

  19. Report sensory analyses veal

    NARCIS (Netherlands)

    Veldman, M.; Schelvis-Smit, A.A.M.

    2005-01-01

    On behalf of a client of Animal Sciences Group, different varieties of veal were analyzed by both instrumental and sensory analyses. The sensory evaluation was performed with a sensory analytical panel in the period of 13th of May and 31st of May, 2005. The three varieties of veal were: young bull,

  20. Contesting Citizenship: Comparative Analyses

    DEFF Research Database (Denmark)

    Siim, Birte; Squires, Judith

    2007-01-01

    . Comparative citizenship analyses need to be considered in relation to multipleinequalities and their intersections and to multiple governance and trans-national organisinf. This, in turn, suggests that comparative citizenship analysis needs to consider new spaces in which struggles for equal citizenship occur...

  1. Meta-analyses

    NARCIS (Netherlands)

    Hendriks, M.A.; Luyten, J.W.; Scheerens, J.; Sleegers, P.J.C.; Scheerens, J.

    2014-01-01

    In this chapter results of a research synthesis and quantitative meta-analyses of three facets of time effects in education are presented, namely time at school during regular lesson hours, homework, and extended learning time. The number of studies for these three facets of time that could be used

  2. eSNaPD: a versatile, web-based bioinformatics platform for surveying and mining natural product biosynthetic diversity from metagenomes.

    Science.gov (United States)

    Reddy, Boojala Vijay B; Milshteyn, Aleksandr; Charlop-Powers, Zachary; Brady, Sean F

    2014-08-14

    Environmental Surveyor of Natural Product Diversity (eSNaPD) is a web-based bioinformatics and data aggregation platform that aids in the discovery of gene clusters encoding both novel natural products and new congeners of medicinally relevant natural products using (meta)genomic sequence data. Using PCR-generated sequence tags, the eSNaPD data-analysis pipeline profiles biosynthetic diversity hidden within (meta)genomes by comparing sequence tags to a reference data set of characterized gene clusters. Sample mapping, molecule discovery, library mapping, and new clade visualization modules facilitate the interrogation of large (meta)genomic sequence data sets for diverse downstream analyses, including, but not limited to, the identification of environments rich in untapped biosynthetic diversity, targeted molecule discovery efforts, and chemical ecology studies. eSNaPD is designed to generate a global atlas of biosynthetic diversity that can facilitate a systematic, sequence-based interrogation of nature's biosynthetic potential.

  3. Bio-informatics Research Progress in the Post-genome Era Based on the Quantitative Analysis of SCIE

    Institute of Scientific and Technical Information of China (English)

    Yongqin; ZHAN; Min; YU

    2013-01-01

    SCIE paper output can reflect the status quo and trend of discipline research and 7 038 scientific articles concerning bioinformatics are retrieved in SCIE database during the years between 2008 and 2012. Quantitative analysis of paper output and citation frequency are conducted according to nations, institutions, publications, research direction as well as hot articles, which provides assistance for bioinformatics researchers to understand the present situation of this subject, carry out cooperative studies and display scientific research achievements.

  4. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  5. Making authentic science accessible—the benefits and challenges of integrating bioinformatics into a high-school science curriculum

    Science.gov (United States)

    Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat

    2017-01-01

    Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled ‘Bioinformatics in the Service of Biotechnology’. Students’ learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students’ difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students’ cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students’ scientific ‘toolbox’. For students, questions stemming from the ‘old world’ biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers’ prediction. Analysis of students’ affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher’s role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. PMID:26801769

  6. Bioinformatic Analysis of Deleterious Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs in the Coding Regions of Human Prion Protein Gene (PRNP

    Directory of Open Access Journals (Sweden)

    Kourosh Bamdad

    2016-12-01

    Full Text Available Background & Objective: Single nucleotide polymorphisms are the cause of genetic variation to living organisms. Single nucleotide polymorphisms alter residues in the protein sequence. In this investigation, the relationship between prion protein gene polymorphisms and its relevance to pathogenicity was studied. Material & Method: Amino acid sequence of the main isoform from the human prion protein gene (PRNP was extracted from UniProt database and evaluated by FoldAmyloid and AmylPred servers. All non-synonymous single nucleotide polymorphisms (nsSNPs from SNP database (dbSNP were further analyzed by bioinformatics servers including SIFT, PolyPhen-2, I-Mutant-3.0, PANTHER, SNPs & GO, PHD-SNP, Meta-SNP, and MutPred to determine the most damaging nsSNPs. Results: The results of the first structure analyses by FoldAmyloid and AmylPerd servers implied that regions including 5-15, 174-178, 180-184, 211-217, and 240-252 were the most sensitive parts of the protein sequence to amyloidosis. Screening all nsSNPs of the main protein isoform using bioinformatic servers revealed that substitution of Aspartic acid with Valine at position 178 (ID code: rs11538766 was the most deleterious nsSNP in the protein structure. Conclusion:  Substitution of the Aspartic acid with Valine at position 178 (D178V was the most pathogenic mutation in the human prion protein gene. Analyses from the MutPred server also showed that beta-sheets’ increment in the secondary structure was the main reason behind the molecular mechanism of the prion protein aggregation.

  7. The Bioinformatic Analysis of the blcap Gene%宫颈癌相关blcap基因的生物信息学分析

    Institute of Scientific and Technical Information of China (English)

    刘娟; 熊金虎; 伍欣星

    2004-01-01

    BLCAP is a potential gene for suppression of cervical carcinoma, which was found by analysing the cervical carcinoma specimen with the oncogene and anti-oncogene cDNA microarray. Basing on the bioinformatical analyses, we try to predict the function of blcap gene. The results show that there are several genes that highly resemble with blcap. The comparability between the sequences of blcap and Homo sapiens mRNA (DKFZp564M053) or BC10 is 99% and 87%, respectively. The protein encoded by BLCAP is composed of Leu(19.5%), pro(9.19%), ser(8.04%)、 cys(8.04%) and other amino acids. The secondary structure of the N-terminal of BLCAP encoded protein is an alpha helix. In the C-terminal, it is beta sheet and in the middle, it is coil. The of the terminals is more hydrophobile than the middle region. Between 45-55aa, there is a transmembrane region. Therefore, we forecast the BLCAP is a member of transmembrane protein I. By analyzing the signal peptide and the procedure of blcap gene with the program of SignalP (V1.1), we found a cleavage site in 59-66aa. By using the program of Netpho, we predicted there might be three phospholate sites at 68aa, 73aa and 78aa. At 78-81aa, we found a typical [ST]-X [2] -[DE] structure—the phospholate site of tyrosine protein kinase, which might be related to its function. Bioinformatic studies of blcap provided the foundation for the function researches of BLCAP in laboratory.

  8. Analysing Access Control Specifications

    DEFF Research Database (Denmark)

    Probst, Christian W.; Hansen, René Rydhof

    2009-01-01

    . Recent events have revealed intimate knowledge of surveillance and control systems on the side of the attacker, making it often impossible to deduce the identity of an inside attacker from logged data. In this work we present an approach that analyses the access control configuration to identify the set......When prosecuting crimes, the main question to answer is often who had a motive and the possibility to commit the crime. When investigating cyber crimes, the question of possibility is often hard to answer, as in a networked system almost any location can be accessed from almost anywhere. The most...... of credentials needed to reach a certain location in a system. This knowledge allows to identify a set of (inside) actors who have the possibility to commit an insider attack at that location. This has immediate applications in analysing log files, but also nontechnical applications such as identifying possible...

  9. Possible future HERA analyses

    CERN Document Server

    Geiser, Achim

    2015-01-01

    A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing $ep$ collider data and their physics scope. Comparisons to the original scope of the HERA programme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-e...

  10. Possible future HERA analyses

    Energy Technology Data Exchange (ETDEWEB)

    Geiser, Achim

    2015-12-15

    A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing ep collider data and their physics scope. Comparisons to the original scope of the HERA pro- gramme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-established data and MC sets, calibrations, and analysis procedures the manpower and expertise needed for a particular analysis is often very much smaller than that needed for an ongoing experiment. Since centrally funded manpower to carry out such analyses is not available any longer, this contribution not only targets experienced self-funded experimentalists, but also theorists and master-level students who might wish to carry out such an analysis.

  11. Biomass feedstock analyses

    Energy Technology Data Exchange (ETDEWEB)

    Wilen, C.; Moilanen, A.; Kurkela, E. [VTT Energy, Espoo (Finland). Energy Production Technologies

    1996-12-31

    The overall objectives of the project `Feasibility of electricity production from biomass by pressurized gasification systems` within the EC Research Programme JOULE II were to evaluate the potential of advanced power production systems based on biomass gasification and to study the technical and economic feasibility of these new processes with different type of biomass feed stocks. This report was prepared as part of this R and D project. The objectives of this task were to perform fuel analyses of potential woody and herbaceous biomasses with specific regard to the gasification properties of the selected feed stocks. The analyses of 15 Scandinavian and European biomass feed stock included density, proximate and ultimate analyses, trace compounds, ash composition and fusion behaviour in oxidizing and reducing atmospheres. The wood-derived fuels, such as whole-tree chips, forest residues, bark and to some extent willow, can be expected to have good gasification properties. Difficulties caused by ash fusion and sintering in straw combustion and gasification are generally known. The ash and alkali metal contents of the European biomasses harvested in Italy resembled those of the Nordic straws, and it is expected that they behave to a great extent as straw in gasification. Any direct relation between the ash fusion behavior (determined according to the standard method) and, for instance, the alkali metal content was not found in the laboratory determinations. A more profound characterisation of the fuels would require gasification experiments in a thermobalance and a PDU (Process development Unit) rig. (orig.) (10 refs.)

  12. AMS analyses at ANSTO

    Energy Technology Data Exchange (ETDEWEB)

    Lawson, E.M. [Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW (Australia). Physics Division

    1998-03-01

    The major use of ANTARES is Accelerator Mass Spectrometry (AMS) with {sup 14}C being the most commonly analysed radioisotope - presently about 35 % of the available beam time on ANTARES is used for {sup 14}C measurements. The accelerator measurements are supported by, and dependent on, a strong sample preparation section. The ANTARES AMS facility supports a wide range of investigations into fields such as global climate change, ice cores, oceanography, dendrochronology, anthropology, and classical and Australian archaeology. Described here are some examples of the ways in which AMS has been applied to support research into the archaeology, prehistory and culture of this continent`s indigenous Aboriginal peoples. (author)

  13. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    DEFF Research Database (Denmark)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.;

    2005-01-01

    between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  14. A New Approach to Sequence Analysis Exemplified by Identification of cis-Elements in Abscisic Acid Inducible Promoters

    DEFF Research Database (Denmark)

    Busk, Peter Kamp; Hallin, Peter Fischer; Salomon, Jesper

    Identification of cis-regulatory elements in DNA promoters by computational approaches are a great help in genome-wide elucidation of signal transduction, transcriptional regulation and gene expression. However, to accomplish this has proven a difficult problem in computational genomics in part d...

  15. Whorl-specific expression of the SUPERMAN gene of Arabidopsis is mediated by cis elements in the transcribed region.

    Science.gov (United States)

    Ito, Toshiro; Sakai, Hajime; Meyerowitz, Elliot M

    2003-09-02

    The SUPERMAN (SUP) gene of Arabidopsis is involved in controlling cell proliferation in stamen and carpel primordia and in ovules during flower development. The SUP gene encodes a transcription factor with a C2H2-type zinc finger motif, a serine/proline-rich domain, a basic domain, and a leucine-zipper-like domain and is expressed in a very limited region in stamen primordia and in the developing ovary during flower development. The SUP gene is susceptible to methylation, resulting in epigenetic gene silencing. To understand how the SUP gene is expressed spatially and temporally in its restricted domain, and why methylation of the transcribed region affects early-stage SUP expression, we have identified the SUP cis regulatory elements by characterizing SUP gene fusions. These studies show that the SUP gene has discrete upstream promoter elements required for expression in stamen primordia in early stages and in the ovary in later stages. The promoter activity for stamen primordia is modulated by several positive and negative elements located in the transcribed and translated regions. Several regulatory elements in the transcribed region correlate with the areas of the gene that are heavily methylated in epigenetic alleles; these data provide a possible explanation of how methylation of the transcribed region represses transcription.

  16. Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

    Directory of Open Access Journals (Sweden)

    Lundberg Cathryn

    2008-01-01

    Full Text Available Abstract Background Gene expression measurements from breast cancer (BrCa tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays. Results We gathered expression data from 9 published microarray studies examining estrogen receptor positive (ER+ and estrogen receptor negative (ER- BrCa tumor cases from the Oncomine database. We performed a meta-analysis and identified genes that were universally up or down regulated with respect to ER+ versus ER- tumor status. We surveyed both the proximal promoter and 3' untranslated regions (3'UTR of our top-ranking genes in each expression group to test whether common sequence elements may contribute to the observed expression patterns. Utilizing a combination of known transcription factor binding sites (TFBS, evolutionarily conserved mammalian promoter and 3'UTR motifs, and microRNA (miRNA seed sequences, we identified numerous motifs that were disproportionately represented between the two gene classes suggesting a common regulatory network for the observed gene expression patterns. Conclusion Some of the genes we identified distinguish key transcripts previously seen in array studies, while others are newly defined. Many of the genes identified as overexpressed in ER- tumors were previously identified as expression markers for neoplastic transformation in multiple human cancers. Moreover, our motif analysis identified a collection of specific cis-acting target sites which may collectively play a role in the differential gene expression patterns observed in ER+ versus ER- breast cancer tumors. Importantly, the gene sets and associated DNA motifs provide a starting point with which to explore the mechanistic basis for the observed expression patterns in breast tumors.

  17. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. These resources potency should be explored to discover new drugs for human wealth. However, the bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is developed based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolites product. Then we employ bioinformatics to explore the potency of biomaterial from Indonesia biodiversity by comparing species with the well-known taxon containing the active compound through published paper or chemical database. Then we analyze drug-likeness, bioactivity and the target proteins of the active compound based on their molecular structure. The target protein was examined their interaction with other proteins in the cell to determine action mechanism of the active compounds in the cellular level, as well as to predict its side effects and toxicity. By using this method, we succeeded to screen anti-cancer, immunomodulators and anti-inflammation from Indonesia biodiversity. For example, we found anticancer from marine invertebrate by employing the method. The anti-cancer was explore based on the isolated compounds of marine invertebrate from published article and database, and then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate able to kill cancer cell. Further, we collect and extract the active compound from the invertebrate, and then examined the activity on cancer cell (MCF7). The MTT result showed that the methanol extract of marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is cheap and robust way to explore bioactive from Indonesia biodiversity for source of drug and another

  18. A bioinformatics search pipeline, RNA2DSearch, identifies RNA localization elements in Drosophila retrotransposons.

    Science.gov (United States)

    Hamilton, Russell S; Hartswood, Eve; Vendra, Georgia; Jones, Cheryl; Van De Bor, Veronique; Finnegan, David; Davis, Ilan

    2009-02-01

    mRNA localization is a widespread mode of delivering proteins to their site of function. The embryonic axes in Drosophila are determined in the oocyte, through Dynein-dependent transport of gurken/TGF-alpha mRNA, containing a small localization signal that assigns its destination. A signal with a similar secondary structure, but lacking significant sequence similarity, is present in the I factor retrotransposon mRNA, also transported by Dynein. It is currently unclear whether other mRNAs exist that are localized to the same site using similar signals. Moreover, searches for other genes containing similar elements have not been possible due to a lack of suitable bioinformatics methods for searches of secondary structure elements and the difficulty of experimentally testing all the possible candidates. We have developed a bioinformatics approach for searching across the genome for small RNA elements that are similar to the secondary structures of particular localization signals. We have uncovered 48 candidates, of which we were able to test 22 for their localization potential using injection assays for Dynein mediated RNA localization. We found that G2 and Jockey transposons each contain a gurken/I factor-like RNA stem-loop required for Dynein-dependent localization to the anterior and dorso-anterior corner of the oocyte. We conclude that I factor, G2, and Jockey are members of a "family" of transposable elements sharing a gurken-like mRNA localization signal and Dynein-dependent mechanism of transport. The bioinformatics pipeline we have developed will have broader utility in fields where small RNA signals play important roles.

  19. Should we have blind faith in bioinformatics software? Illustrations from the SNAP web-based tool.

    Directory of Open Access Journals (Sweden)

    Sébastien Robiou-du-Pont

    Full Text Available Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37. With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%. With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any 'false positive' SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file. The Cohen's Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes. In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation.

  20. Bioinformatics analysis of the factors controlling type I IFN gene expression in autoimmune disease and virus-induced immunity

    Directory of Open Access Journals (Sweden)

    Di eFeng

    2013-09-01

    Full Text Available Patients with systemic lupus erythematosus (SLE and Sjögren's syndrome (SS display increased levels of type I IFN-induced genes. Plasmacytoid dendritic cells (PDCs are natural interferon producing cells and considered to be a primary source of IFN-α in these two diseases. Differential expression patterns of type I IFN inducible transcripts can be found in different immune cell subsets and in patients with both active and inactive autoimmune disease. A type I IFN gene signature generally consists of three groups of IFN-induced genes - those regulated in response to virus-induced type I IFN, those regulated by the IFN-induced mitogen-activated protein kinase/extracellular-regulated kinase (MAPK/ERK pathway, and those by the IFN-induced phosphoinositide-3 kinase (PI-3K pathway. These three groups of type I IFN-regulated genes control important cellular processes such as apoptosis, survival, adhesion, and chemotaxis, that when dysregulated, contribute to autoimmunity. With the recent generation of large datasets in the public domain from next-generation sequencing and DNA microarray experiments, one can perform detailed analyses of cell type-specific gene signatures as well as identify distinct transcription factors that differentially regulate these gene signatures. We have performed bioinformatics analysis of data in the public domain and experimental data from our lab to gain insight into the regulation of type I IFN gene expression. We have found that the genetic landscape of the IFNA and IFNB genes are occupied by transcription factors, such as insulators CTCF and cohesin, that negatively regulate transcription, as well as IRF5 and IRF7, that positively and distinctly regulate IFNA subtypes. A detailed understanding of the factors controlling type I IFN gene transcription will significantly aid in the identification and development of new therapeutic strategies targeting the IFN pathway in autoimmune disease.

  1. Bioinformatic analysis of microRNA networks following the activation of the constitutive androstane receptor (CAR) in mouse liver.

    Science.gov (United States)

    Hao, Ruixin; Su, Shengzhong; Wan, Yinan; Shen, Frank; Niu, Ben; Coslo, Denise M; Albert, Istvan; Han, Xing; Omiecinski, Curtis J

    2016-09-01

    The constitutive androstane receptor (CAR; NR1I3) is a member of the nuclear receptor superfamily that functions as a xenosensor, serving to regulate xenobiotic detoxification, lipid homeostasis and energy metabolism. CAR activation is also a key contributor to the development of chemical hepatocarcinogenesis in mice. The underlying pathways affected by CAR in these processes are complex and not fully elucidated. MicroRNAs (miRNAs) have emerged as critical modulators of gene expression and appear to impact many cellular pathways, including those involved in chemical detoxification and liver tumor development. In this study, we used deep sequencing approaches with an Illumina HiSeq platform to differentially profile microRNA expression patterns in livers from wild type C57BL/6J mice following CAR activation with the mouse CAR-specific ligand activator, 1,4-bis-[2-(3,5,-dichloropyridyloxy)] benzene (TCPOBOP). Bioinformatic analyses and pathway evaluations were performed leading to the identification of 51 miRNAs whose expression levels were significantly altered by TCPOBOP treatment, including mmu-miR-802-5p and miR-485-3p. Ingenuity Pathway Analysis of the differentially expressed microRNAs revealed altered effector pathways, including those involved in liver cell growth and proliferation. A functional network among CAR targeted genes and the affected microRNAs was constructed to illustrate how CAR modulation of microRNA expression may potentially mediate its biological role in mouse hepatocyte proliferation. This article is part of a Special Issue entitled: Xenobiotic nuclear receptors: New Tricks for An Old Dog, edited by Dr. Wen Xie.

  2. Identification of Immunoreactive Leishmania infantum Protein Antigens to Asymptomatic Dog Sera through Combined Immunoproteomics and Bioinformatics Analysis

    Science.gov (United States)

    Samiotaki, Martina; Panayotou, George; Karagouni, Evdokia

    2016-01-01

    Leishmania infantum is the etiologic agent of zoonotic visceral leishmaniasis (VL) in countries in the Mediterranean basin, where dogs are the domestic reservoirs and represent important elements in the transmission of the disease. Since the major focal areas of human VL exhibit a high prevalence of seropositive dogs, the control of canine VL could reduce the infection rate in humans. Efforts toward this have focused on the improvement of diagnostic tools, as well as on vaccine development. The identification of parasite antigens including suitable major histocompatibility complex (MHC) class I- and/or II-restricted epitopes is very important since disease protection is characterized by strong and long-lasting CD8+ T and CD4+ Th1 cell-dominated immunity. In the present study, total protein extract from late-log phase L. infantum promastigotes was analyzed by two-dimensional western blots and probed with sera from asymptomatic and symptomatic dogs. A total of 42 protein spots were found to differentially react with IgG from asymptomatic dogs, while 17 of these identified by Coommasie stain were extracted and analyzed. Of these, 21 proteins were identified by mass spectrometry; they were mainly involved in metabolism and stress responses. An in silico analysis predicted that the chaperonin HSP60, dihydrolipoamide dehydrogenase, enolase, cyclophilin 2, cyclophilin 40, and one hypothetical protein contain promiscuous MHCI and/or MHCII epitopes. Our results suggest that the combination of immunoproteomics and bioinformatics analyses is a promising method for the identification of novel candidate antigens for vaccine development or with potential use in the development of sensitive diagnostic tests. PMID:26906226

  3. Significance and suppression of redundant IL17 responses in acute allograft rejection by bioinformatics based drug repositioning of fenofibrate.

    Directory of Open Access Journals (Sweden)

    Silke Roedder

    Full Text Available Despite advanced immunosuppression, redundancy in the molecular diversity of acute rejection (AR often results in incomplete resolution of the injury response. We present a bioinformatics based approach for identification of these redundant molecular pathways in AR and a drug repositioning approach to suppress these using FDA approved drugs currently available for non-transplant indications. Two independent microarray data-sets from human renal allograft biopsies (n = 101 from patients on majorly Th1/IFN-y immune response targeted immunosuppression, with and without AR, were profiled. Using gene-set analysis across 3305 biological pathways, significant enrichment was found for the IL17 pathway in AR in both data-sets. Recent evidence suggests IL17 pathway as an important escape mechanism when Th1/IFN-y mediated responses are suppressed. As current immunosuppressions do not specifically target the IL17 axis, 7200 molecular compounds were interrogated for FDA approved drugs with specific inhibition of this axis. A combined IL17/IFN-y suppressive role was predicted for the antilipidemic drug Fenofibrate. To assess the immunregulatory action of Fenofibrate, we conducted in-vitro treatment of anti-CD3/CD28 stimulated human peripheral blood cells (PBMC, and, as predicted, Fenofibrate reduced IL17 and IFN-γ gene expression in stimulated PMBC. In-vivo Fenofibrate treatment of an experimental rodent model of cardiac AR reduced infiltration of total leukocytes, reduced expression of IL17/IFN-y and their pathway related genes in allografts and recipients' spleens, and extended graft survival by 21 days (p<0.007. In conclusion, this study provides important proof of concept that meta-analyses of genomic data and drug databases can provide new insights into the redundancy of the rejection response and presents an economic methodology to reposition FDA approved drugs in organ transplantation.

  4. Proceedings of the 2013 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference.

    Science.gov (United States)

    Wren, Jonathan D; Dozmorov, Mikhail G; Burian, Dennis; Kaundal, Rakesh; Perkins, Andy; Perkins, Ed; Kupfer, Doris M; Springer, Gordon K

    2013-01-01

    The tenth annual conference of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS 2013), "The 10th Anniversary in a Decade of Change: Discovery in a Sea of Data", took place at the Stoney Creek Inn & Conference Center in Columbia, Missouri on April 5-6, 2013. This year's Conference Chairs were Gordon Springer and Chi-Ren Shyu from the University of Missouri and Edward Perkins from the US Army Corps of Engineers Engineering Research and Development Center, who is also the current MCBIOS President (2012-3). There were 151 registrants and a total of 111 abstracts (51 oral presentations and 60 poster session abstracts).

  5. An interdepartmental Ph.D. program in computational biology and bioinformatics: the Yale perspective.

    Science.gov (United States)

    Gerstein, Mark; Greenbaum, Dov; Cheung, Kei; Miller, Perry L

    2007-02-01

    Computational biology and bioinformatics (CBB), the terms often used interchangeably, represent a rapidly evolving biological discipline. With the clear potential for discovery and innovation, and the need to deal with the deluge of biological data, many academic institutions are committing significant resources to develop CBB research and training programs. Yale formally established an interdepartmental Ph.D. program in CBB in May 2003. This paper describes Yale's program, discussing the scope of the field, the program's goals and curriculum, as well as a number of issues that arose in implementing the program. (Further updated information is available from the program's website, www.cbb.yale.edu.)

  6. A Generic and Dynamic Framework for Web Publishing of Bioinformatics Databases

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The present paper covers a generic and dynamic framework for the web publishing of bioinformatics databases based upon a meta data design, Java Bean, Java Server Page(JSP), Extensible Markup Language(XML), Extensible Stylesheet Language(XSL) and Extensible Stylesheet Language Transformation(XSLT). In this framework, the content is stored in a configurable and structured XML format, dynamically generated from an Oracle Relational Database Management System(RDBMS). The presentation is dynamically generated by transforming the XML document into HTML through XSLT.This clean separation between content and presentation makes the web publishing more flexible; changing the presentation only needs a modification of the Extensive Stylesheet(XS).

  7. Bioinformatics prediction of swine MHC class I epitopes from Porcine Reproductive and Respiratory Syndrome Virus

    DEFF Research Database (Denmark)

    Welner, Simon; Nielsen, Morten; Lund, Ole

    of the host cell antiviral machinery, to the deceptive induction of a non-neutralizing antibody response through decoy antigen presentation. This, combined with a very high mutation rate, has hampered the development of safe and effective vaccines. With the overall aim to design a vaccine that induces...... an effective CTL response against PRRSV, we have taken a bioinformatics approach to identify common PRRSV epitopes predicted to react broadly with predominant swine MHC (SLA) alleles. First, the genomic integrity and sequencing method was examined for 334 available complete PRRSV type 2 genomes leaving 104...

  8. Recent advances in computational biology, bioinformatics, medicine, and healthcare by modern OR

    OpenAIRE

    Türkay, Metin; Weber, Gerhard-Wilhelm; Blazewicz, Jacek; Rauner, Marion

    2014-01-01

    CEJOR (2014) 22:427–430 DOI 10.1007/s10100-013-0327-2 EDITORIAL Recent advances in computational biology, bioinformatics, medicine, and healthcare by modern OR Gerhard-Wilhelm Weber · Jacek Blazewicz · Marion Rauner · Metin Türkay Published online: 7 September 2013 © Springer-Verlag Berlin Heidelberg 2013 At the occasion of the 25th European Conference on Operational Research, EURO XXV 2012, July 8–11, 2012, in Vilnius, Lithuania (http://www.euro-2012.lt/), the ...

  9. Quantum Bio-Informatics:From Quantum Information to Bio-Informatics

    CERN Document Server

    Freudenberg, W; Ohya, M

    2008-01-01

    The purpose of this volume is examine bio-informatics and quantum information, which are growing rapidly at present, and to attempt to connect the two, with a view to enumerating and solving the many fundamental problems they entail. To this end, we look for interdisciplinary bridges in mathematics, physics, and information and life sciences. In particular, research into a new paradigm for information science and life science on the basis of quantum theory is emphasized. Sample Chapter(s). Markov Fields on Graphs (599 KB). Contents: Markov Fields on Graphs (L Accardi & H Ohno); Some Aspects of

  10. Identification of Putative Rhamnogalacturonan-II Specific Glycosyltransferases in Arabidopsis Using a Combination of Bioinformatics Approaches

    OpenAIRE

    Aline Voxeur; Aurélie André; Christelle Breton; Patrice Lerouge

    2012-01-01

    Rhamnogalacturonan-II (RG-II) is a complex plant cell wall polysaccharide that is composed of an α(1,4)-linked homogalacturonan backbone substituted with four side chains. It exists in the cell wall in the form of a dimer that is cross-linked by a borate di-ester. Despite its highly complex structure, RG-II is evolutionarily conserved in the plant kingdom suggesting that this polymer has fundamental functions in the primary wall organisation. In this study, we have set up a bioinformatics str...

  11. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-09-01

    Full Text Available Bioinformatics and genome science (BGS are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so.

  12. Swarm intelligence in bioinformatics: methods and implementations for discovering patterns of multiple sequences.

    Science.gov (United States)

    Cui, Zhihua; Zhang, Yi

    2014-02-01

    As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Beneath the enormous number of open problems in this field, one fundamental issue is about the accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discover patterns of multiple sequences. To provide a deep insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithm are selected, and their applications to multiple sequence alignment and motif detecting problem are discussed.

  13. Bioinformatics-Based Identification of Chemosensory Proteins in African Malaria Mosquito, Anopheles gambiae

    Institute of Scientific and Technical Information of China (English)

    Zhengxi Li; Zuorui Shen; Jingjiang Zhou; Lin Field

    2003-01-01

    Chemosensory proteins (CSPs) are identifiable by four spatially conserved Cysteine residues in their primary structure or by two disulfide bridges in their tertiary structure according to the previously identified olfactory specific-D related proteins. A genomics- and bioinformatics-based approach is taken in the present study to identify the putative CSPs in the malaria-carrying mosquito, Anopheles gambiae. The results show that five out of the nine annotated candidates are the most possible Anopheles CSPs of A. gambiae. This study lays the foundation for further functional identification of Anopheles CSPs, though all of these candidates need additional experimental verification.

  14. Tissue damage in organic rainbow trout muscle investigated by proteomics and bioinformatics

    DEFF Research Database (Denmark)

    Wulff, Tune; Silva, T.; Nielsen, Michael Engelbrecht

    2013-01-01

    damage and proteins were separated using 2-DE. The experimental design included two groups of rainbow trout, which were fed organic feed either with or without astaxanthin. In total, 96 proteins were found to be affected by tissue damage, clearly demonstrating in this lower vertebrate the complexity...... and magnitude of the cellular response, in the context of a regenerative process. Using a bioinformatics approach, the main biological function of these proteins were assigned, showing the regulation of proteins involved in processes like apoptosis, iron homeostasis and regulation of muscular structure...

  15. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Abstract Background Poikiloderma with Neutropenia (PN is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results We characterize six PN patients expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations that depicts clusters featuring a founder effect. Conclusions In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of C16orf57 gene seems the proper way to address the correct diagnosis of PN, enabling the syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  16. EEG analyses with SOBI.

    Energy Technology Data Exchange (ETDEWEB)

    Glickman, Matthew R.; Tang, Akaysha (University of New Mexico, Albuquerque, NM)

    2009-02-01

    The motivating vision behind Sandia's MENTOR/PAL LDRD project has been that of systems which use real-time psychophysiological data to support and enhance human performance, both individually and of groups. Relevant and significant psychophysiological data being a necessary prerequisite to such systems, this LDRD has focused on identifying and refining such signals. The project has focused in particular on EEG (electroencephalogram) data as a promising candidate signal because it (potentially) provides a broad window on brain activity with relatively low cost and logistical constraints. We report here on two analyses performed on EEG data collected in this project using the SOBI (Second Order Blind Identification) algorithm to identify two independent sources of brain activity: one in the frontal lobe and one in the occipital. The first study looks at directional influences between the two components, while the second study looks at inferring gender based upon the frontal component.

  17. Network class superposition analyses.

    Directory of Open Access Journals (Sweden)

    Carl A B Pearson

    Full Text Available Networks are often used to understand a whole system by modeling the interactions among its pieces. Examples include biomolecules in a cell interacting to provide some primary function, or species in an environment forming a stable community. However, these interactions are often unknown; instead, the pieces' dynamic states are known, and network structure must be inferred. Because observed function may be explained by many different networks (e.g., ≈ 10(30 for the yeast cell cycle process, considering dynamics beyond this primary function means picking a single network or suitable sample: measuring over all networks exhibiting the primary function is computationally infeasible. We circumvent that obstacle by calculating the network class ensemble. We represent the ensemble by a stochastic matrix T, which is a transition-by-transition superposition of the system dynamics for each member of the class. We present concrete results for T derived from boolean time series dynamics on networks obeying the Strong Inhibition rule, by applying T to several traditional questions about network dynamics. We show that the distribution of the number of point attractors can be accurately estimated with T. We show how to generate Derrida plots based on T. We show that T-based Shannon entropy outperforms other methods at selecting experiments to further narrow the network structure. We also outline an experimental test of predictions based on T. We motivate all of these results in terms of a popular molecular biology boolean network model for the yeast cell cycle, but the methods and analyses we introduce are general. We conclude with open questions for T, for example, application to other models, computational considerations when scaling up to larger systems, and other potential analyses.

  18. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics.

    Science.gov (United States)

    Megy, Karine; Emrich, Scott J; Lawson, Daniel; Campbell, David; Dialynas, Emmanuel; Hughes, Daniel S T; Koscielny, Gautier; Louis, Christos; Maccallum, Robert M; Redmond, Seth N; Sheehan, Andrew; Topalis, Pantelis; Wilson, Derek

    2012-01-01

    VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community.

  19. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  20. Bioinformatics Analysis for Coding SNPs of the HLADQA1 Gene Involved in Susceptibility to Cervical Cancer

    Institute of Scientific and Technical Information of China (English)

    Yanyun Li; Jun Xing; Linsheng Zhao; Yanni Li; Yuchuan Wang; Weiming Zhang

    2006-01-01

    OBJECTIVE To analyze coding SNPs of the HLA-DQA1 gene involved in susceptibility for cervical cancer by a bioinformatics approach, and to choose some SNPs that may have an association with cervical cancer.METHODS By a SNPper tool we extracted SNPs from a public database (dbSNP), exporting them in FASTA formats suitable for subsequent use.Then we used PARSESNP as a tool for the analysis of the cSNPs.RESULTS In the cSNPs of the HLA-DQA1 gene, we find that rs9272693and rs9272703, are made up of missense mutations which convert a codon for one amino acid into a codon for a different amino acid. We chose a PSSM Difference >10 as a lower level for the scores of changes predicted to be deldterious.CONCLUSION We used a bioinformatics approach for cSNPs analysis of the HLA-DQA1 gene. This method can select the variants in a conserved region, and give a PSSM Difference score. But the results need to be verified in cervical cancer patients and a control population.

  1. Speedup bioinformatics applications on multicore-based processor using vectorizing and multithreading strategies.

    Science.gov (United States)

    Chaichoompu, Kridsadakorn; Kittitornkun, Surin; Tongsima, Sissades

    2007-12-30

    Many computational intensive bioinformatics software, such as multiple sequence alignment, population structure analysis, etc., written in C/C++ are not multicore-aware. A multicore processor is an emerging CPU technology that combines two or more independent processors into a single package. The Single Instruction Multiple Data-stream (SIMD) paradigm is heavily utilized in this class of processors. Nevertheless, most popular compilers including Microsoft Visual C/C++ 6.0, x86 gnu C-compiler gcc do not automatically create SIMD code which can fully utilize the advancement of these processors. To harness the power of the new multicore architecture certain compiler techniques must be considered. This paper presents a generic compiling strategy to assist the compiler in improving the performance of bioinformatics applications written in C/C++. The proposed framework contains 2 main steps: multithreading and vectorizing strategies. After following the strategies, the application can achieve higher speedup by taking the advantage of multicore architecture technology. Due to the extremely fast interconnection networking among multiple cores, it is suggested that the proposed optimization could be more appropriate than making use of parallelization on a small cluster computer which has larger network latency and lower bandwidth.

  2. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    Directory of Open Access Journals (Sweden)

    Borozan Ivan

    2012-08-01

    Full Text Available Abstract Background It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results Here we present CaPSID (Computational Pathogen Sequence IDentification, a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro.

  3. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.

  4. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis.

    Science.gov (United States)

    Nasir, Waqas; Toledo, Alejandro Gomez; Noborn, Fredrik; Nilsson, Jonas; Wang, Mingxun; Bandeira, Nuno; Larson, Göran

    2016-08-01

    Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans.

  5. Documenting the emergence of bio-ontologies: or, why researching bioinformatics requires HPSSB.

    Science.gov (United States)

    Leonelli, Sabina

    2010-01-01

    This paper reflects on the analytic challenges emerging from the study of bioinformatic tools recently created to store and disseminate biological data, such as databases, repositories, and bio-ontologies. I focus my discussion on the Gene Ontology, a term that defines three entities at once: a classification system facilitating the distribution and use of genomic data as evidence towards new insights; an expert community specialised in the curation of those data; and a scientific institution promoting the use of this tool among experimental biologists. These three dimensions of the Gene Ontology can be clearly distinguished analytically, but are tightly intertwined in practice. I suggest that this is true of all bioinformatic tools: they need to be understood simultaneously as epistemic, social, and institutional entities, since they shape the knowledge extracted from data and at the same time regulate the organisation, development, and communication of research. This viewpoint has one important implication for the methodologies used to study these tools; that is, the need to integrate historical, philosophical, and sociological approaches. I illustrate this claim through examples of misunderstandings that may result from a narrowly disciplinary study of the Gene Ontology, as I experienced them in my own research.

  6. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    Full Text Available OBJECTIVE: Candidate genes for non-alcoholic fatty liver disease (NAFLD identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. RESEARCH DESIGN AND METHODS: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs which captured all common variation in these genes were genotyped in 10,196 Danes, and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D, central obesity, and WHO-defined metabolic syndrome (MetS. RESULTS: 273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominal statistical significant associations (P<0.05 to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. CONCLUSIONS: Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.

  7. The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics

    Science.gov (United States)

    Escobar-Zepeda, Alejandra; Vera-Ponce de León, Arturo; Sanchez-Flores, Alejandro

    2015-01-01

    The study of microorganisms that pervade each and every part of this planet has encountered many challenges through time such as the discovery of unknown organisms and the understanding of how they interact with their environment. The aim of this review is to take the reader along the timeline and major milestones that led us to modern metagenomics. This new and thriving area is likely to be an important contributor to solve different problems. The transition from classical microbiology to modern metagenomics studies has required the development of new branches of knowledge and specialization. Here, we will review how the availability of high-throughput sequencing technologies has transformed microbiology and bioinformatics and how to tackle the inherent computational challenges that arise from the DNA sequencing revolution. New computational methods are constantly developed to collect, process, and extract useful biological information from a variety of samples and complex datasets, but metagenomics needs the integration of several of these computational methods. Despite the level of specialization needed in bioinformatics, it is important that life-scientists have a good understanding of it for a correct experimental design, which allows them to reveal the information in a metagenome. PMID:26734060

  8. BioGPS descriptors for rational engineering of enzyme promiscuity and structure based bioinformatic analysis.

    Directory of Open Access Journals (Sweden)

    Valerio Ferrario

    Full Text Available A new bioinformatic methodology was developed founded on the Unsupervised Pattern Cognition Analysis of GRID-based BioGPS descriptors (Global Positioning System in Biological Space. The procedure relies entirely on three-dimensional structure analysis of enzymes and does not stem from sequence or structure alignment. The BioGPS descriptors account for chemical, geometrical and physical-chemical features of enzymes and are able to describe comprehensively the active site of enzymes in terms of "pre-organized environment" able to stabilize the transition state of a given reaction. The efficiency of this new bioinformatic strategy was demonstrated by the consistent clustering of four different Ser hydrolases classes, which are characterized by the same active site organization but able to catalyze different reactions. The method was validated by considering, as a case study, the engineering of amidase activity into the scaffold of a lipase. The BioGPS tool predicted correctly the properties of lipase variants, as demonstrated by the projection of mutants inside the BioGPS "roadmap".

  9. [Bioinformatic analysis of adenoma-normal mucosa SSH library of colon].

    Science.gov (United States)

    Lü, Bing-Jian; Cui, Jing; Xu, Jing; Zhang, Hao; Luo, Min-Jie; Zhu, Yi-Min; Lai, Mao-De

    2006-04-01

    We established a colonic adenoma-normal mucosa suppressive subtraction hybridization (SSH) library in 1999. In this study, we wanted to explore the expression profile of all candidate genes in this library. We developed an EST pipeline which contained two in-house software packages, nucleic acid analytical software and GetUni. The nucleic acid analytical software, an integrator of the universal bioinformatics tools including phred, phd2fasta, cross_match, repeatmasker and blast2.0, can blast sequences of differential clones with the downloaded non-redundant nucleotide (NR) database. GetUni can cluster these NR sequences into Unigene via matching with the downloaded Homo Sapiens UniGene database. Sixty-two candidate genes in A-N library were obtained via the high throughput automatic gene expression bioinformatics pipeline. Gene Ontology online analysis revealed that ribosome genes and immunity-regulating genes were the two most common categories in the KEGG or Biocarta Pathway. We also detected the expression of 2 genes with highest hits, Reg4 and FAM46A, by semi-quantitative RT-PCR. Both genes were up-regulated in 10 or 9 out of 10 adenomas in comparison with the paired normal mucosa, respectively. The candidate genes in A-N library would be of great significance in disclosing the molecular mechanism underlying in colonic adenoma initiation and progression.

  10. Predicting the Nuclear Localization Signals of 107 Types of HPV L1 Proteins by Bioinformatic Analysis

    Institute of Scientific and Technical Information of China (English)

    Jun Yang; Yi-Li Wang; Lü-Sheng Si

    2006-01-01

    In this study, 107 types of human papillomavirus (HPV) L1 protein sequences were obtained from available databases, and the nuclear localization signals (NLSs) of these HPV L1 proteins were analyzed and predicted by bioinformatic analysis.Out of the 107 types, the NLSs of 39 types were predicted by PredictNLS software (35 types of bipartite NLSs and 4 types of monopartite NLSs). The NLSs of the remaining HPV types were predicted according to the characteristics and the homology of the already predicted NLSs as well as the general rule of NLSs.According to the result, the NLSs of 107 types of HPV L1 proteins were classified into 15 categories. The different types of HPV L1 proteins in the same NLS category could share the similar or the same nucleocytoplasmic transport pathway.They might be used as the same target to prevent and treat different types of HPV infection. The results also showed that bioinformatic technology could be used to analyze and predict NLSs of proteins.

  11. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  12. How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications.

    Science.gov (United States)

    Khomtchouk, Bohdan B; Weitz, Edmund; Karp, Peter D; Wahlestedt, Claes

    2016-12-31

    We present a rationale for expanding the presence of the Lisp family of programming languages in bioinformatics and computational biology research. Put simply, Lisp-family languages enable programmers to more quickly write programs that run faster than in other languages. Languages such as Common Lisp, Scheme and Clojure facilitate the creation of powerful and flexible software that is required for complex and rapidly evolving domains like biology. We will point out several important key features that distinguish languages of the Lisp family from other programming languages, and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSLs): languages that are specialized to a particular area, and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the 'programmable programming language'. We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and artificial intelligence research in bioinformatics and computational biology.

  13. Expression and Bioinformatics Analysis of SPACA4 in Human and Mice

    Institute of Scientific and Technical Information of China (English)

    Ai-fa TANG; Zhen-dong YU; Yao-ting GUI; Xin GUO; Xian-xin LI; Wei-xiang LIU; Hui ZHU; Zhi-ming CAI

    2008-01-01

    Objective To analyze the expression of SPACA4 in human and mice. Methods Testes cRNA samples from Balb/c mice of different postnatal days were performed with mouse affymetrix chip to screen the expression of SPACA4 in mice. Sub-quantitative RT-PCR and bioinformatic tools were used here to describe the expression profile of SPACA4 in mice and human. Results The results of gene chip analysis indicated that the expression of mSPACA4 began after d 35 of postnatal testis in mice. Sub-quantitative RT-PCR assay showed that SPACA4 gene expressed exclusively in mouse and human testis, and mouse mSPACA4 gene expressed after d 35 of postnatal testis that was consistency with the results of gene chip analysis. By bioinformatics analysis, mSPACA4 is located in cell membrane (34.8%) or plasma membrane (34.8%), the signal peptide cleavage site between position 19 and 20 amino acids, transmembrane region between 2-20 and 101-126 amino acids, respectively, on mSPACA4 protein. Conclusion mSPACA4 and hSPACA4 were testis-specific genes, and the expression of mSPACA4 begins after d 35 of postnatal testis in mice. SPACA4 is a candidate for targeting in a sperm-based contraceptive vaccine.

  14. BioZone Exploting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    Energy Technology Data Exchange (ETDEWEB)

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  15. Bioinformatics analysis of two-component regulatory systems in Staphylococcus epidermidis

    Institute of Scientific and Technical Information of China (English)

    QIN Zhiqiang; ZHONG Yang; ZHANG Jian; HE Youyu; WU Yang; JIANG Juan; CHEN Jiemin; LUO Xiaomin; QU Di

    2004-01-01

    Sixteen pairs of two-component regulatory systems are identified in the genome of Staphylococcus epidermidis ATCC12228 strain, which is newly sequenced by our laboratory for Medical Molecular Virology and Chinese National Human Genome Center at Shanghai, by using bioinformatics analysis. Comparative analysis of the twocomponent regulatory systems in S. epidermidis and that of S.aureus and Bacillus subtilis shows that these systems may regulate some important biological functions, e.g. growth,biofilm formation, and expression of virulence factors in S.epidermidis. Two conserved domains, i.e. HATPase_c and REC domains, are found in all 16 pairs of two-component proteins.Homologous modelling analysis indicates that there are 4similar HATPase_c domain structures of histidine kinases and 13 similar REC domain structures of response regulators,and there is one AMP-PNP binding pocket in the HATPase_c domain and three active aspartate residues in the REC domain. Preliminary experiment reveals that the bioinformatics analysis of the conserved domain structures in the two-component regulatory systems in S. epidermidis may provide useful information for discovery of potential drug target.

  16. Applying bioinformatics to proteomics: is machine learning the answer to biomarker discovery for PD and MSA?

    Science.gov (United States)

    Mattison, Hayley A; Stewart, Tessandra; Zhang, Jing

    2012-11-01

    Bioinformatics tools are increasingly being applied to proteomic data to facilitate the identification of biomarkers and classification of patients. In the June, 2012 issue, Ishigami et al. used principal component analysis (PCA) to extract features and support vector machine (SVM) to differentiate and classify cerebrospinal fluid (CSF) samples from two small cohorts of patients diagnosed with either Parkinson's disease (PD) or multiple system atrophy (MSA) based on differences in the patterns of peaks generated with matrix-assisted desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). PCA accurately segregated patients with PD and MSA from controls when the cohorts were combined, but did not perform well when segregating PD from MSA. On the other hand, SVM, a machine learning classification model, correctly classified the samples from patients with early PD or MSA, and the peak at m/z 6250 was identified as a strong contributor to the ability of SVM to distinguish the proteomic profiles of either cohort when trained on one cohort. This study, while preliminary, provides promising results for the application of bioinformatics tools to proteomic data, an approach that may eventually facilitate the ability of clinicians to differentiate and diagnose closely related parkinsonian disorders.

  17. Website-analyse

    DEFF Research Database (Denmark)

    Thorlacius, Lisbeth

    2009-01-01

    planlægning af de funktionelle og indholdsmæssige aspekter ved websites. Der findes en stor mængde teori- og metodebøger, som har specialiseret sig i de tekniske problemstillinger i forbindelse med interaktion og navigation, samt det sproglige indhold på websites. Den danske HCI (Human Computer Interaction...... hyperfunktionelle websites. Det primære ærinde for HCI-eksperterne er at udarbejde websites, som er brugervenlige. Ifølge deres direktiver skal websites være opbygget med hurtige og effektive navigations- og interaktionsstrukturer, hvor brugeren kan få sine informationer ubesværet af lange downloadingshastigheder...... eller blindgyder, når han/hun besøger sitet. Studier i design og analyse af de visuelle og æstetiske aspekter i planlægning og brug af websites har imidlertid kun i et begrænset omfang været under reflektorisk behandling. Det er baggrunden for dette kapitel, som indleder med en gennemgang af æstetikkens...

  18. 生物信息学跨学科研究%The Research on Bioinformatics Interdisciplinary

    Institute of Scientific and Technical Information of China (English)

    武妍; 胡德华

    2012-01-01

    目的:研究生物信息学起源、发展趋势,与其他学科相互交叉渗透关系的发展及强度.方法:利用美国《科学引文索引》(SCI)数据库web of science,运用文献计量学方法对8种权威生物信息学期刊2001年至2010年于2011年1月15日之前上传至web of science的全部文献进行统计及分析.通过研究生物信息学相关论文的主题分类,被引情况及施引文献的分类,寻找其跨学科的趋势及相关研究领域的进展情况、主要内容.结果:生物信息学的相关文献数在2001-2010间逐年增加,在2009-2010年达到高峰.跨学科领域广泛,并以生物化学、分子生物学、计算生物学、微生物学、数学、统计学等学科为主要交叉学科.各交叉学科与生物信息学之间跨学科研究的文献数也呈逐年递增趋势.结论:生物信息学的跨学科范围广泛,发展迅速.%Objective:To study the origin, developing trends of bioinformatics, relationship with other disciplines, the intensity and development of interdisciplinary situation. Methods: In this paper, the United States "Science Citation Index" (SCI) database, web of science, and the method of bibliometrics are used to do a research about all literature from 8 authoritative bioinformatics journals from the year 2001 to 2010 with metrology statistics. Through researches about the classification related to the theme of the papers of the literature, and the classification of cited literature's discipline, the trends and the research areas of bioinformatics1 inter-disciplinary development can be known. Results: Bioinformatics literature in the decade from 2001 to 2010 is increased year by year and reached a peak in 2009 and 2010 year. The interdisciplinary fields of bioinformatics become wider. Among all the interdisciplinary fields, biochemistry, molecular biology, computational biology, microbiology, mathematics, statistics and other disciplines are the main cross-disciplinary disciplines

  19. Evaluating the effectiveness of a practical inquiry-based learning bioinformatics module on undergraduate student engagement and applied skills.

    Science.gov (United States)

    Brown, James A L

    2016-05-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion, qualitative student-based module evaluation and the novelty, scientific validity and quality of written student reports. Bioinformatics is often the starting point for laboratory-based research projects, therefore high importance was placed on allowing students to individually develop and apply processes and methods of scientific research. Students led a bioinformatic inquiry-based project (within a framework of inquiry), discovering, justifying and exploring individually discovered research targets. Detailed assessable reports were produced, displaying data generated and the resources used. Mimicking research settings, undergraduates were divided into small collaborative groups, with distinctive central themes. The module was evaluated by assessing the quality and originality of the students' targets through reports, reflecting students' use and understanding of concepts and tools required to generate their data. Furthermore, evaluation of the bioinformatic module was assessed semi-quantitatively using pre- and post-module quizzes (a non-assessable activity, not contributing to their grade), which incorporated process- and content-specific questions (indicative of their use of the online tools). Qualitative assessment of the teaching intervention was performed using post-module surveys, exploring student satisfaction and other module specific elements. Overall, a positive experience was found, as was a post module increase in correct process-specific answers. In conclusion, an inquiry-based peer-assisted learning module increased students' engagement, practical bioinformatic skills and process-specific knowledge. © 2016 by

  20. DCCP and DICP: Construction and Analyses of Databases for Copper- and Iron-Chelating Proteins

    Institute of Scientific and Technical Information of China (English)

    Hao Wu; Yan Yang; Sheng-Juan Jiang; Ling-Ling Chen; Hai-Xia Gao; Qing-Shan Fu; Feng Li; Bin-Guang Ma; Hong-Yu Zhang

    2005-01-01

    Copper and iron play important roles in a variety of biological processes, especially when being chelated with proteins. The proteins involved in the metal binding,transporting and metabolism have aroused much interest. To facilitate the study on this topic, we constructed two databases (DCCP and DICP) containing the known copper- and iron-chelating proteins, which are freely available from the website http:∥sdbi.sdut.edu.cn/en. Users can conveniently search and browse all of the entries in the databases. Based on the two databases, bioinformatic analyses were performed, which provided some novel insights into metalloproteins.