WorldWideScience

Sample records for bioinformatic cis-element analyses

  1. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    OpenAIRE

    Knox Barry E; Qin Maochun; McIlvain Vera A; Danko Charles G; Pertsov Arkady M

    2007-01-01

    Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding site...

  2. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    Directory of Open Access Journals (Sweden)

    Knox Barry E

    2007-10-01

    Full Text Available Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding sites involved in regulating the differences between murine rod and cone photoreceptor populations. Results To identify highly conserved motifs enriched in promoters that drive expression in either rod or cone photoreceptors, we assembled a set of murine rod-specific, cone-specific, and non-photoreceptor background promoter sequences. These sets were used as input to a newly devised motif discovery algorithm called Iterative Alignment/Modular Motif Selection (IAMMS. Using IAMMS, we predicted 34 motifs that may contribute to rod-specific (19 motifs or cone-specific (15 motifs expression patterns. Of these, 16 rod- and 12 cone-specific motifs were found in clusters near the transcription start site. New findings include the observation that cone promoters tend to contain TATA boxes, while rod promoters tend to be TATA-less (exempting Rho and Cnga1. Additionally, we identify putative sites for IL-6 effectors (in rods and RXR family members (in cones that can explain experimental data showing changes to cell-fate by activating these signaling pathways during rod/cone development. Two of the predicted motifs (NRE and ROP2 have been confirmed experimentally to be involved in cell-specific expression patterns. We provide a full database of predictions as additional data that may contain further valuable information. IAMMS predictions are compared with existing motif discovery algorithms, DME and BioProspector. We find that over 60% of IAMMS predictions are confirmed by

  3. Bioinformatic cis-element analyses performed in Arabidopsis and rice disclose bZIP- and MYB-related binding sites as potential AuxRE-coupling elements in auxin-mediated transcription

    Directory of Open Access Journals (Sweden)

    Berendzen Kenneth W

    2012-08-01

    Full Text Available Abstract Background In higher plants, a diverse array of developmental and growth-related processes is regulated by the plant hormone auxin. Recent publications have proposed that besides the well-characterized Auxin Response Factors (ARFs that bind Auxin Response Elements (AuxREs, also members of the bZIP- and MYB-transcription factor (TF families participate in transcriptional control of auxin-regulated genes via bZIP Response Elements (ZREs or Myb Response Elements (MREs, respectively. Results Applying a novel bioinformatic algorithm, we demonstrate on a genome-wide scale that singular motifs or composite modules of AuxREs, ZREs, MREs but also of MYC2 related elements are significantly enriched in promoters of auxin-inducible genes. Despite considerable, species-specific differences in the genome structure in terms of the GC content, this enrichment is generally conserved in dicot (Arabidopsis thaliana and monocot (Oryza sativa model plants. Moreover, an enrichment of defined composite modules has been observed in selected auxin-related gene families. Consistently, a bipartite module, which encompasses a bZIP-associated G-box Related Element (GRE and an AuxRE motif, has been found to be highly enriched. Making use of transient reporter studies in protoplasts, these findings were experimentally confirmed, demonstrating that GREs functionally interact with AuxREs in regulating auxin-mediated transcription. Conclusions Using genome-wide bioinformatic analyses, evolutionary conserved motifs have been defined which potentially function as AuxRE-dependent coupling elements to establish auxin-specific expression patterns. Based on these findings, experimental approaches can be designed to broaden our understanding of combinatorial, auxin-controlled gene regulation.

  4. Bioinformatics analyses for signal transduction networks

    Institute of Scientific and Technical Information of China (English)

    LIU Wei; LI Dong; ZHU YunPing; HE FuChu

    2008-01-01

    Research in signaling networks contributes to a deeper understanding of organism living activities. With the development of experimental methods in the signal transduction field, more and more mechanisms of signaling pathways have been discovered. This paper introduces such popular bioin-formatics analysis methods for signaling networks as the common mechanism of signaling pathways and database resource on the Internet, summerizes the methods of analyzing the structural properties of networks, including structural Motif finding and automated pathways generation, and discusses the modeling and simulation of signaling networks in detail, as well as the research situation and tendency in this area. Now the investigation of signal transduction is developing from small-scale experiments to large-scale network analysis, and dynamic simulation of networks is closer to the real system. With the investigation going deeper than ever, the bioinformatics analysis of signal transduction would have immense space for development and application.

  5. Proteomic and Bioinformatics Analyses of Mouse Liver Microsomes

    Directory of Open Access Journals (Sweden)

    Fang Peng

    2012-01-01

    Full Text Available Microsomes are derived mostly from endoplasmic reticulum and are an ideal target to investigate compound metabolism, membrane-bound enzyme functions, lipid-protein interactions, and drug-drug interactions. To better understand the molecular mechanisms of the liver and its diseases, mouse liver microsomes were isolated and enriched with differential centrifugation and sucrose gradient centrifugation, and microsome membrane proteins were further extracted from isolated microsomal fractions by the carbonate method. The enriched microsome proteins were arrayed with two-dimensional gel electrophoresis (2DE and carbonate-extracted microsome membrane proteins with one-dimensional gel electrophoresis (1DE. A total of 183 2DE-arrayed proteins and 99 1DE-separated proteins were identified with tandem mass spectrometry. A total of 259 nonredundant microsomal proteins were obtained and represent the proteomic profile of mouse liver microsomes, including 62 definite microsome membrane proteins. The comprehensive bioinformatics analyses revealed the functional categories of those microsome proteins and provided clues into biological functions of the liver. The systematic analyses of the proteomic profile of mouse liver microsomes not only reveal essential, valuable information about the biological function of the liver, but they also provide important reference data to analyze liver disease-related microsome proteins for biomarker discovery and mechanism clarification of liver disease.

  6. Whale song analyses using bioinformatics sequence analysis approaches

    Science.gov (United States)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools DNA/protein sequence alignment and alignment-free methods are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, masters thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure; the standardized Euclidian distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree with earlier classifications derived by human observation qualitatively. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.

  7. Competing endogenous RNA and interactome bioinformatic analyses on human telomerase.

    Science.gov (United States)

    Arancio, Walter; Pizzolanti, Giuseppe; Genovese, Swonild Ilenia; Baiamonte, Concetta; Giordano, Carla

    2014-04-01

    We present a classic interactome bioinformatic analysis and a study on competing endogenous (ce) RNAs for hTERT. The hTERT gene codes for the catalytic subunit and limiting component of the human telomerase complex. Human telomerase reverse transcriptase (hTERT) is essential for the integrity of telomeres. Telomere dysfunctions have been widely reported to be involved in aging, cancer, and cellular senescence. The hTERT gene network has been analyzed using the BioGRID interaction database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA (http://genemania.org/). The network of interaction of hTERT transcripts has been further analyzed following the competing endogenous (ce) RNA hypotheses (messenger [m] RNAs cross-talk via micro [mi] RNAs) using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest a role for Akt, nuclear factor-κB (NF-κB), heat shock protein 90 (HSP90), p70/p80 autoantigen, 14-3-3 proteins, and dynein in telomere functions. Roles for histone acetylation/deacetylation and proteoglycan metabolism are also proposed. PMID:24713059

  8. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged as a...

  9. Proteomic and bioinformatic analyses of spinal cord injury‑induced skeletal muscle atrophy in rats.

    Science.gov (United States)

    Wei, Zhi-Jian; Zhou, Xian-Hu; Fan, Bao-You; Lin, Wei; Ren, Yi-Ming; Feng, Shi-Qing

    2016-07-01

    Spinal cord injury (SCI) may result in skeletal muscle atrophy. Identifying diagnostic biomarkers and effective targets for treatment is an important challenge in clinical work. The aim of the present study is to elucidate potential biomarkers and therapeutic targets for SCI‑induced muscle atrophy (SIMA) using proteomic and bioinformatic analyses. The protein samples from rat soleus muscle were collected at different time points following SCI injury and separated by two‑dimensional gel electrophoresis and compared with the sham group. The identities of these protein spots were analyzed by mass spectrometry (MS). MS demonstrated that 20 proteins associated with muscle atrophy were differentially expressed. Bioinformatic analyses indicated that SIMA changed the expression of proteins associated with cellular, developmental, immune system and metabolic processes, biological adhesion and localization. The results of the present study may be beneficial in understanding the molecular mechanisms of SIMA and elucidating potential biomarkers and targets for the treatment of muscle atrophy. PMID:27177391

  10. Modeling an evolutionary conserved circadian cis-element.

    Directory of Open Access Journals (Sweden)

    Eric R Paquet

    2008-02-01

    Full Text Available Circadian oscillator networks rely on a transcriptional activator called CLOCK/CYCLE (CLK/CYC in insects and CLOCK/BMAL1 or NPAS2/BMAL1 in mammals. Identifying the targets of this heterodimeric basic-helix-loop-helix (bHLH transcription factor poses challenges and it has been difficult to decipher its specific sequence affinity beyond a canonical E-box motif, except perhaps for some flanking bases contributing weakly to the binding energy. Thus, no good computational model presently exists for predicting CLK/CYC, CLOCK/BMAL1, or NPAS2/BMAL1 targets. Here, we use a comparative genomics approach and first study the conservation properties of the best-known circadian enhancer: a 69-bp element upstream of the Drosophila melanogaster period gene. This fragment shows a signal involving the presence of two closely spaced E-box-like motifs, a configuration that we can also detect in the other four prominent CLK/CYC target genes in flies: timeless, vrille, Pdp1, and cwo. This allows for the training of a probabilistic sequence model that we test using functional genomics datasets. We find that the predicted sequences are overrepresented in promoters of genes induced in a recent study by a glucocorticoid receptor-CLK fusion protein. We then scanned the mouse genome with the fly model and found that many known CLOCK/BMAL1 targets harbor sequences matching our consensus. Moreover, the phase of predicted cyclers in liver agreed with known CLOCK/BMAL1 regulation. Taken together, we built a predictive model for CLK/CYC or CLOCK/BMAL1-bound cis-enhancers through the integration of comparative and functional genomics data. Finally, a deeper phylogenetic analysis reveals that the link between the CLOCK/BMAL1 complex and the circadian cis-element dates back to before insects and vertebrates diverged.

  11. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    Directory of Open Access Journals (Sweden)

    Lei Wang

    Full Text Available Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days. Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease.

  12. Identification of a Positive Cis-Element Upstream of Human NKX3.1 Gene

    Institute of Scientific and Technical Information of China (English)

    An-Li JIANG; Peng-Ju ZHANG; Xiao-Yan HU; Wei-Wen CHEN; Feng KONG; Zhi-Fang LIU; Hui-Qing YUAN; Jian-Ye ZHANG

    2005-01-01

    NKX3.1 is a prostate-specific homeobox gene related to prostate development and prostate cancer. In this work, we aimed to identify precisely the functional cis-element in the 197 bp region (from -1032 to -836 bp) of the NKX3.1 promoter (from -1032 to +8 bp), which was previously identified to present positive regulatory activity on NKX3.1 expression, by deletion mutagenesis analysis and electrophoretic mobility shift assay (EMSA). A 16 bp positive cis-element located between -920 and -905 bp upstream of the NKX3.1 gene was identified by deletion mutation analysis and proved to be a functional positive cis-element by EMSA. It will be important to further study the functions and regulatory mechanisms of this positive cis-element in NKX3.1 gene expression.

  13. Managing, Analysing, and Integrating Big Data in Medical Bioinformatics: Open Problems and Future Perspectives

    Directory of Open Access Journals (Sweden)

    Ivan Merelli

    2014-01-01

    Full Text Available The explosion of the data both in the biomedical research and in the healthcare systems demands urgent solutions. In particular, the research in omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare is additionally always asking for a tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to better understanding of diseases and development of better and personalized diagnostics and therapeutics. However, such progresses are directly related to the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, for its annotation and integration and finally for inferring knowledge and making it available to researchers. Bioinformatics can be viewed as the “glue” for all these processes. A clear awareness of present high performance computing (HPC solutions in bioinformatics, Big Data analysis paradigms for computational biology, and the issues that are still open in the biomedical and healthcare fields represent the starting point to win this challenge.

  14. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yongqun eHe

    2012-02-01

    Full Text Available Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of ten classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  15. Structural, bioinformatic, and in vivo analyses of two Treponema pallidum lipoproteins reveal a unique TRAP transporter

    Science.gov (United States)

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-01-01

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP- independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP) and tp0958 (the symporter) are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of “tetratricopeptide repeat” (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPR-protein associated TRAP transporters (TPATs) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s). PMID:22306465

  16. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    Energy Technology Data Exchange (ETDEWEB)

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V. (NIH); (UTSMC)

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  17. FastGroupII: A web-based bioinformatics platform for analyses of large 16S rDNA libraries

    Directory of Open Access Journals (Sweden)

    McNairnie Pat

    2006-02-01

    Full Text Available Abstract Background High-throughput sequencing makes it possible to rapidly obtain thousands of 16S rDNA sequences from environmental samples. Bioinformatic tools for the analyses of large 16S rDNA sequence databases are needed to comprehensively describe and compare these datasets. Results FastGroupII is a web-based bioinformatics platform to dereplicate large 16S rDNA libraries. FastGroupII provides users with the option of four different dereplication methods, performs rarefaction analysis, and automatically calculates the Shannon-Wiener Index and Chao1. FastGroupII was tested on a set of 16S rDNA sequences from coral-associated Bacteria. The different grouping algorithms produced similar, but not identical, results. This suggests that 16S rDNA datasets need to be analyzed in multiple ways when being used for community ecology studies. Conclusion FastGroupII is an effective bioinformatics tool for the trimming and dereplication of 16S rDNA sequences. Several standard diversity indices are calculated, and the raw sequences are prepared for downstream analyses.

  18. Solid Phase Biosensors for Arsenic or Cadmium Composed of A trans Factor and cis Element Complex

    Directory of Open Access Journals (Sweden)

    Mohammad Shohel Rana Siddiki

    2011-10-01

    Full Text Available The presence of toxic metals in drinking water has hazardous effects on human health. This study was conducted to develop GFP-based-metal-binding biosensors for on-site assay of toxic metal ions. GFP-tagged ArsR and CadC proteins bound to a cis element, and lost the capability of binding to it in their As- and Cd-binding conformational states, respectively. Water samples containing toxic metals were incubated on a complex of GFP-tagged ArsR or CadC and cis element which was immobilized on a solid surface. Metal concentrations were quantified with fluorescence intensity of the metal-binding states released from the cis element. Fluorescence intensity obtained with the assay significantly increased with increasing concentrations of toxic metals. Detection limits of 1 μg/L for Cd(II and 5 μg/L for As(III in purified water and 10 µg/L for Cd(II and As(III in tap water and bottled mineral water were achieved by measurement with a battery-powered portable fluorometer after 15-min and 30-min incubation, respectively. A complex of freeze dried GFP-tagged ArsR or CadC binding to cis element was stable at 4 °C and responded to 5 μg/L As(III or Cd(II. The solid phase biosensors are sensitive, less time-consuming, portable, and could offer a protocol for on-site evaluation of the toxic metals in drinking water.

  19. Solid phase biosensors for arsenic or cadmium composed of A trans factor and cis element complex.

    Science.gov (United States)

    Siddiki, Mohammad Shohel Rana; Kawakami, Yasunari; Ueda, Shunsaku; Maeda, Isamu

    2011-01-01

    The presence of toxic metals in drinking water has hazardous effects on human health. This study was conducted to develop GFP-based-metal-binding biosensors for on-site assay of toxic metal ions. GFP-tagged ArsR and CadC proteins bound to a cis element, and lost the capability of binding to it in their As- and Cd-binding conformational states, respectively. Water samples containing toxic metals were incubated on a complex of GFP-tagged ArsR or CadC and cis element which was immobilized on a solid surface. Metal concentrations were quantified with fluorescence intensity of the metal-binding states released from the cis element. Fluorescence intensity obtained with the assay significantly increased with increasing concentrations of toxic metals. Detection limits of 1 μg/L for Cd(II) and 5 μg/L for As(III) in purified water and 10 µg/L for Cd(II) and As(III) in tap water and bottled mineral water were achieved by measurement with a battery-powered portable fluorometer after 15-min and 30-min incubation, respectively. A complex of freeze dried GFP-tagged ArsR or CadC binding to cis element was stable at 4 °C and responded to 5 μg/L As(III) or Cd(II). The solid phase biosensors are sensitive, less time-consuming, portable, and could offer a protocol for on-site evaluation of the toxic metals in drinking water. PMID:22346629

  20. Advances in Omics and Bioinformatics Tools for Systems Analyses of Plant Functions

    OpenAIRE

    Mochida, Keiichi; Shinozaki, Kazuo

    2011-01-01

    Omics and bioinformatics are essential to understanding the molecular systems that underlie various plant functions. Recent game-changing sequencing technologies have revitalized sequencing approaches in genomics and have produced opportunities for various emerging analytical applications. Driven by technological advances, several new omics layers such as the interactome, epigenome and hormonome have emerged. Furthermore, in several plant species, the development of omics resources has progre...

  1. Assessing viral taxonomic composition in benthic marine ecosystems: reliability and efficiency of different bioinformatic tools for viral metagenomic analyses

    Science.gov (United States)

    Tangherlini, M.; Dell’Anno, A.; Zeigler Allen, L.; Riccioni, G.; Corinaldesi, C.

    2016-01-01

    In benthic deep-sea ecosystems, which represent the largest biome on Earth, viruses have a recognised key ecological role, but their diversity is still largely unknown. Identifying the taxonomic composition of viruses is crucial for understanding virus-host interactions, their role in food web functioning and evolutionary processes. Here, we compared the performance of various bioinformatic tools (BLAST, MG-RAST, NBC, VMGAP, MetaVir, VIROME) for analysing the viral taxonomic composition in simulated viromes and viral metagenomes from different benthic deep-sea ecosystems. The analyses of simulated viromes indicate that all the BLAST tools, followed by MetaVir and VMGAP, are more reliable in the affiliation of viral sequences and strains. When analysing the environmental viromes, tBLASTx, MetaVir, VMGAP and VIROME showed a similar efficiency of sequence annotation; however, MetaVir and tBLASTx identified a higher number of viral strains. These latter tools also identified a wider range of viral families than the others, providing a wider view of viral taxonomic diversity in benthic deep-sea ecosystems. Our findings highlight strengths and weaknesses of available bioinformatic tools for investigating the taxonomic diversity of viruses in benthic ecosystems in order to improve our comprehension of viral diversity in the oceans and its relationships with host diversity and ecosystem functioning. PMID:27329207

  2. Assessing viral taxonomic composition in benthic marine ecosystems: reliability and efficiency of different bioinformatic tools for viral metagenomic analyses.

    Science.gov (United States)

    Tangherlini, M; Dell'Anno, A; Zeigler Allen, L; Riccioni, G; Corinaldesi, C

    2016-01-01

    In benthic deep-sea ecosystems, which represent the largest biome on Earth, viruses have a recognised key ecological role, but their diversity is still largely unknown. Identifying the taxonomic composition of viruses is crucial for understanding virus-host interactions, their role in food web functioning and evolutionary processes. Here, we compared the performance of various bioinformatic tools (BLAST, MG-RAST, NBC, VMGAP, MetaVir, VIROME) for analysing the viral taxonomic composition in simulated viromes and viral metagenomes from different benthic deep-sea ecosystems. The analyses of simulated viromes indicate that all the BLAST tools, followed by MetaVir and VMGAP, are more reliable in the affiliation of viral sequences and strains. When analysing the environmental viromes, tBLASTx, MetaVir, VMGAP and VIROME showed a similar efficiency of sequence annotation; however, MetaVir and tBLASTx identified a higher number of viral strains. These latter tools also identified a wider range of viral families than the others, providing a wider view of viral taxonomic diversity in benthic deep-sea ecosystems. Our findings highlight strengths and weaknesses of available bioinformatic tools for investigating the taxonomic diversity of viruses in benthic ecosystems in order to improve our comprehension of viral diversity in the oceans and its relationships with host diversity and ecosystem functioning. PMID:27329207

  3. Phosphoproteomics and bioinformatics analyses of spinal cord proteins in rats with morphine tolerance.

    Directory of Open Access Journals (Sweden)

    Wen-Jinn Liaw

    Full Text Available INTRODUCTION: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. METHODS: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal cord level by giving them twice-daily injections of morphine (20 µg/10 µL for 4 days. We confirmed tolerance by measuring paw withdrawal latencies and maximal possible analgesic effect of morphine on day 5. We then carried out phosphoproteomic analysis to investigate the global phosphorylation of spinal proteins associated with morphine tolerance. Finally, pull-down assays were used to identify phosphorylated types and sites of 14-3-3 proteins, and bioinformatics was applied to predict biological networks impacted by the morphine-regulated proteins. RESULTS: Our proteomics data showed that repeated morphine treatment altered phosphorylation of 10 proteins in the spinal cord. Pull-down assays identified 2 serine/threonine phosphorylated sites in 14-3-3 proteins. Bioinformatics further revealed that morphine impacted on cytoskeletal reorganization, neuroplasticity, protein folding and modulation, signal transduction and biomolecular metabolism. CONCLUSIONS: Repeated morphine administration may affect multiple biological networks by altering protein phosphorylation. These data may provide insight into the mechanism that underlies the development of morphine tolerance.

  4. Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes

    Science.gov (United States)

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.

  5. Biophysical and Bioinformatic Analyses Implicate the Treponema pallidum Tp34 Lipoprotein (Tp0971) in Transition Metal Homeostasis

    Science.gov (United States)

    Brautigam, Chad A.; Deka, Ranjit K.; Ouyang, Zhiming; Machius, Mischa; Knutsen, Gregory; Tomchick, Diana R.

    2012-01-01

    Metal ion homeostasis is a critical function of many integral and peripheral membrane proteins. The genome of the etiologic agent of syphilis, Treponema pallidum, is compact and devoid of many metabolic enzyme genes. Nevertheless, it harbors genes coding for homologs of several enzymes that typically require either iron or zinc. The product of the tp0971 gene of T. pallidum, designated Tp34, is a periplasmic lipoprotein that is thought to be tethered to the inner membrane of this organism. Previous work on a water-soluble (nonacylated) recombinant version of Tp34 established that this protein binds to Zn2+, which, like other transition metal ions, stabilizes the dimeric form of the protein. In this study, we employed analytical ultracentrifugation to establish that four transition metal ions (Ni2+, Co2+, Cu2+, and Zn2+) readily induce the dimerization of Tp34; Cu2+ (50% effective concentration [EC50] = 1.7 μM) and Zn2+ (EC50 = 6.2 μM) were the most efficacious of these ions. Mutations of the crystallographically identified metal-binding residues hindered the ability of Tp34 to dimerize. X-ray crystallography performed on crystals of Tp34 that had been incubated with metal ions indicated that the binding site could accommodate the metals examined. The findings presented herein, coupled with bioinformatic analyses of related proteins, point to Tp34's likely role in metal ion homeostasis in T. pallidum. PMID:23042995

  6. Expression of zebrafish pax6b in pancreas is regulated by two enhancers containing highly conserved cis-elements bound by PDX1, PBX and PREP factors

    Directory of Open Access Journals (Sweden)

    Biemar Frédéric

    2008-05-01

    Full Text Available Abstract Background PAX6 is a transcription factor playing a crucial role in the development of the eye and in the differentiation of the pancreatic endocrine cells as well as of enteroendocrine cells. Studies on the mouse Pax6 gene have shown that sequences upstream from the P0 promoter are required for expression in the lens and the pancreas; but there remain discrepancies regarding the precise location of the pancreatic regulatory elements. Results Due to genome duplication in the evolution of ray-finned fishes, zebrafish has two pax6 genes, pax6a and pax6b. While both zebrafish pax6 genes are expressed in the developing eye and nervous system, only pax6b is expressed in the endocrine cells of the pancreas. To investigate the cause of this differential expression, we used a combination of in silico, in vivo and in vitro approaches. We show that the pax6b P0 promoter targets expression to endocrine pancreatic cells and also to enteroendocrine cells, retinal neurons and the telencephalon of transgenic zebrafish. Deletion analyses indicate that strong pancreatic expression of the pax6b gene relies on the combined action of two conserved regulatory enhancers, called regions A and C. By means of gel shift assays, we detected binding of the homeoproteins PDX1, PBX and PREP to several cis-elements of these regions. In constrast, regions A and C of the zebrafish pax6a gene are not active in the pancreas, this difference being attributable to sequence divergences within two cis-elements binding the pancreatic homeoprotein PDX1. Conclusion Our data indicate a conserved role of enhancers A and C in the pancreatic expression of pax6b and emphasize the importance of the homeoproteins PBX and PREP cooperating with PDX1, in activating pax6b expression in endocrine pancreatic cells. This study also provides a striking example of how adaptative evolution of gene regulatory sequences upon gene duplication progressively leads to subfunctionalization of the

  7. NMR spectroscopic and bioinformatic analyses of the LTBP1 C-terminus reveal a highly dynamic domain organisation.

    Directory of Open Access Journals (Sweden)

    Ian B Robertson

    Full Text Available Proteins from the LTBP/fibrillin family perform key structural and functional roles in connective tissues. LTBP1 forms the large latent complex with TGFβ and its propeptide LAP, and sequesters the latent growth factor to the extracellular matrix. Bioinformatics studies suggest the main structural features of the LTBP1 C-terminus are conserved through evolution. NMR studies were carried out on three overlapping C-terminal fragments of LTBP1, comprising four domains with characterised homologues, cbEGF14, TB3, EGF3 and cbEGF15, and three regions with no homology to known structures. The NMR data reveal that the four domains adopt canonical folds, but largely lack the interdomain interactions observed with homologous fibrillin domains; the exception is the EGF3-cbEGF15 domain pair which has a well-defined interdomain interface. (15N relaxation studies further demonstrate that the three interdomain regions act as flexible linkers, allowing a wide range of motion between the well-structured domains. This work is consistent with the LTBP1 C-terminus adopting a flexible "knotted rope" structure, which may facilitate cell matrix interactions, and the accessibility to proteases or other factors that could contribute to TGFβ activation.

  8. Application of bioinformatics in tropical medicine

    Institute of Scientific and Technical Information of China (English)

    Wiwanitkit V

    2008-01-01

    Bioinformatics is a usage of information technology to help solve biological problems by designing novel and in-cisive algorithms and methods of analyses.Bioinformatics becomes a discipline vital in the era of post-genom-ics.In this review article,the application of bioinformatics in tropical medicine will be presented and dis-cussed.

  9. Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

    Science.gov (United States)

    de Brevern, Alexandre G.; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain

    2015-01-01

    Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with abundance of methods, tools, and data formats. To overcome these problems, focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from the Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries. PMID:26125026

  10. Identification of a peroxisome proliferator responsive element (PPRE)-like cis-element in mouse plasminogen activator inhibitor-1 gene promoter

    International Nuclear Information System (INIS)

    PAI-1 is expressed and secreted by adipose tissue which may mediate the pathogenesis of obesity-associated cardiovascular complications. Evidence is presented in this report that PAI-1 is not expressed by preadipocyte, but significantly induced during 3T3-L1 adipocyte differentiation and the PAI-1 expression correlates with the induction of peroxisome proliferator-activated receptor γ (PPARγ). A peroxisome proliferator responsive element (PPRE)-like cis-element (-206TCCCCCATGCCCT-194) is identified in the mouse PAI-1 gene promoter by electrophoretic mobility shift assay (EMSA) combined with transient transfection experiments; the PPRE-like cis-element forms a specific DNA-protein complex only with adipocyte nuclear extracts, not with preadipocyte nuclear extracts; the DNA-protein complex can be totally competed away by non-labeled consensus PPRE, and can be supershifted with PPARγ antibody. Mutation of this PPRE-like cis-element can abolish the transactivation of mouse PAI-1 promoter mediated by PPARγ. Specific PPARγ ligand Pioglitazone can significantly induce the PAI-1 expression, and stimulate the secretion of PAI-1 into medium

  11. Statement complementing the EFSA opinion on application EFSA GMO UK 2007 41 (cotton MON 88913 for food and feed uses, import and processing) taking into consideration updated bioinformatic analyses

    OpenAIRE

    EFSA Panel on Genetically Modified Organisms (GMO)

    2014-01-01

    In this statement, the EFSA GMO Panel responds to a request from the European Commission (EC) to complement its partially inconclusive scientific opinion on cotton MON 88913 taking into consideration updated bioinformatic analyses submitted by the applicant after the adoption. Similarity searches assessed the identity of the genomic sequences flanking the MON 88913 insert, the potential of creating open reading frames (ORFs) showing similarity to known all...

  12. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  13. Bioinformatics in Africa: The Rise of Ghana?

    Directory of Open Access Journals (Sweden)

    Thomas K Karikari

    2015-09-01

    Full Text Available Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  14. Engineering BioInformatics

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    @@ With the completion of human genome sequencing, a new era of bioinformatics st arts. On one hand, due to the advance of high throughput DNA microarray technol ogies, functional genomics such as gene expression information has increased exp onentially and will continue to do so for the foreseeable future. Conventional m eans of storing, analysing and comparing related data are already overburdened. Moreover, the rich information in genes , their functions and their associated wide biological implication requires new technologies of analysing data that employ sophisticated statistical and machine learning algorithms, powerful com puters and intensive interaction together different data sources such as seque nce data, gene expression data, proteomics data and metabolic pathway informati on to discover complex genomic structures and functional patterns with other bi ological process to gain a comprehensive understanding of cell physiology.

  15. An Introduction to Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    SHENGQi-zheng; DeMoorBart

    2004-01-01

    As a newborn interdisciplinary field, bioinformatics is receiving increasing attention from biologists, computer scientists, statisticians, mathematicians and engineers. This paper briefly introduces the birth, importance, and extensive applications of bioinformatics in the different fields of biological research. A major challenge in bioinformatics - the unraveling of gene regulation - is discussed in detail.

  16. Statement complementing the EFSA opinion on application EFSA GMO UK 2007 41 (cotton MON 88913 for food and feed uses, import and processing taking into consideration updated bioinformatic analyses

    Directory of Open Access Journals (Sweden)

    EFSA Panel on Genetically Modified Organisms (GMO

    2014-03-01

    Full Text Available In this statement, the EFSA GMO Panel responds to a request from the European Commission (EC to complement its partially inconclusive scientific opinion on cotton MON 88913 taking into consideration updated bioinformatic analyses submitted by the applicant after the adoption. Similarity searches assessed the identity of the genomic sequences flanking the MON 88913 insert, the potential of creating open reading frames (ORFs showing similarity to known allergens or toxins and the similarity of the newly expressed CP4 EPSPS protein to known allergens or toxins. Having assessed these searches, the EFSA GMO Panel did not identify interruptions of known cotton genes or any safety issue arising from the identified ORFs including the newly expressed CP4 EPSPS protein. In conclusion, the EFSA GMO Panel considers that cotton MON 88913, as assessed in the scientific opinion on application EFSA-GMO-UK-2007-41 and in the supplementary bioinformatic dataset, is as safe and nutritious as its conventional counterpart and commercial cotton varieties with respect to potential effects on human and animal health and the environment in the context of its intended uses.

  17. Multiple cis elements within the Igf2/H19 insulator domain organize a distance-dependent silencer. A cautionary note.

    Science.gov (United States)

    Ginjala, Vasudeva; Holmgren, Claes; Ullerås, Erik; Kanduri, Chandrasekhar; Pant, Vinod; Lobanenkov, Victor; Franklin, Gary; Ohlsson, Rolf

    2002-02-22

    The 5'-flank of the H19 gene harbors a differentially methylated imprinting control region that represses the maternally derived Igf2 and paternally derived H19 alleles. Here we show that the H19 imprinting control region (ICR) is a potent silencer when positioned in a promoter-proximal position. The silencing effect is not alleviated by trichostatin A treatment, suggesting that it does not involve histone deacetylase functions. When the H19 ICR is separated from the promoter by more than 1.2 +/- 0.3 kb, however, trichostatin A stimulates promoter activity 10-fold. Deletion analyses revealed that the silencing feature extended throughout the ICR segment. Finally, chromatin immunopurification analyses revealed that the H19 ICR prevented trichostatin A-dependent reacetylation of histones in the promoter region in a proximal but not in a distal position. We argue that these features are likely to be side effects of the H19 ICR, rather than explaining the mechanism of silencing of the paternal H19 allele. We issue a cautionary note, therefore, that the interpretation of insulator/silencer data could be erroneous should the distance issue not be taken into consideration. PMID:11777900

  18. Deep Learning in Bioinformatics

    OpenAIRE

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-01-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current res...

  19. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  20. CstF-64 and 3'-UTR cis-element determine Star-PAP specificity for target mRNA selection by excluding PAPα.

    Science.gov (United States)

    Kandala, Divya T; Mohan, Nimmy; A, Vivekanand; A P, Sudheesh; G, Reshmi; Laishram, Rakesh S

    2016-01-29

    Almost all eukaryotic mRNAs have a poly (A) tail at the 3'-end. Canonical PAPs (PAPα/γ) polyadenylate nuclear pre-mRNAs. The recent identification of the non-canonical Star-PAP revealed specificity of nuclear PAPs for pre-mRNAs, yet the mechanism how Star-PAP selects mRNA targets is still elusive. Moreover, how Star-PAP target mRNAs having canonical AAUAAA signal are not regulated by PAPα is unclear. We investigate specificity mechanisms of Star-PAP that selects pre-mRNA targets for polyadenylation. Star-PAP assembles distinct 3'-end processing complex and controls pre-mRNAs independent of PAPα. We identified a Star-PAP recognition nucleotide motif and showed that suboptimal DSE on Star-PAP target pre-mRNA 3'-UTRs inhibit CstF-64 binding, thus preventing PAPα recruitment onto it. Altering 3'-UTR cis-elements on a Star-PAP target pre-mRNA can switch the regulatory PAP from Star-PAP to PAPα. Our results suggest a mechanism of poly (A) site selection that has potential implication on the regulation of alternative polyadenylation. PMID:26496945

  1. CstF-64 and 3′-UTR cis-element determine Star-PAP specificity for target mRNA selection by excluding PAPα

    Science.gov (United States)

    Kandala, Divya T.; Mohan, Nimmy; A, Vivekanand; AP, Sudheesh; G, Reshmi; Laishram, Rakesh S.

    2016-01-01

    Almost all eukaryotic mRNAs have a poly (A) tail at the 3′-end. Canonical PAPs (PAPα/γ) polyadenylate nuclear pre-mRNAs. The recent identification of the non-canonical Star-PAP revealed specificity of nuclear PAPs for pre-mRNAs, yet the mechanism how Star-PAP selects mRNA targets is still elusive. Moreover, how Star-PAP target mRNAs having canonical AAUAAA signal are not regulated by PAPα is unclear. We investigate specificity mechanisms of Star-PAP that selects pre-mRNA targets for polyadenylation. Star-PAP assembles distinct 3′-end processing complex and controls pre-mRNAs independent of PAPα. We identified a Star-PAP recognition nucleotide motif and showed that suboptimal DSE on Star-PAP target pre-mRNA 3′-UTRs inhibit CstF-64 binding, thus preventing PAPα recruitment onto it. Altering 3′-UTR cis-elements on a Star-PAP target pre-mRNA can switch the regulatory PAP from Star-PAP to PAPα. Our results suggest a mechanism of poly (A) site selection that has potential implication on the regulation of alternative polyadenylation. PMID:26496945

  2. Introduction to Bioinformatics

    OpenAIRE

    Thampi, Sabu M.

    2009-01-01

    Bioinformatics is a new discipline that addresses the need to manage and interpret the data that in the past decade was massively generated by genomic research. This discipline represents the convergence of genomics, biotechnology and information technology, and encompasses analysis and interpretation of data, modeling of biological phenomena, and development of algorithms and statistics. This article presents an introduction to bioinformatics

  3. Global computing for bioinformatics.

    Science.gov (United States)

    Loewe, Laurence

    2002-12-01

    Global computing, the collaboration of idle PCs via the Internet in a SETI@home style, emerges as a new way of massive parallel multiprocessing with potentially enormous CPU power. Its relations to the broader, fast-moving field of Grid computing are discussed without attempting a review of the latter. This review (i) includes a short table of milestones in global computing history, (ii) lists opportunities global computing offers for bioinformatics, (iii) describes the structure of problems well suited for such an approach, (iv) analyses the anatomy of successful projects and (v) points to existing software frameworks. Finally, an evaluation of the various costs shows that global computing indeed has merit, if the problem to be solved is already coded appropriately and a suitable global computing framework can be found. Then, either significant amounts of computing power can be recruited from the general public, or--if employed in an enterprise-wide Intranet for security reasons--idle desktop PCs can substitute for an expensive dedicated cluster. PMID:12511066

  4. Chemistry in Bioinformatics

    OpenAIRE

    Mitchell John; Murray-Rust Peter; Rzepa Henry

    2005-01-01

    Abstract Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties) free during the auth...

  5. Chemistry in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mitchell John

    2005-06-01

    Full Text Available Abstract Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties free during the authoring process. We argue that the technology is already available; we require a collective agreement to enhance publication protocols.

  6. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  7. Bioinformatics and School Biology

    Science.gov (United States)

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  8. In the Spotlight: Bioinformatics

    Science.gov (United States)

    Wang, May Dongmei

    2016-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparing to more established technologies such as microarrays. NGS technology has been predicated to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  9. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  10. Bioinformatics big data processing

    OpenAIRE

    Cohen-Boulakia, Sarah; Valduriez, Patrick

    2016-01-01

    The volumes of bioinformatics data available on the Web are constantly increasing.Access and joint exploitation of these highly distributed data (i.e, available in distributed Webdata sources) and highly heterogeneous (in text or tabulated les including images, in dierentformats, described with dierent levels of detail and dierent levels of quality ...) is essential forthe biological knowledge to progress. The purpose of this short report is to present in a simpleway the problems of the joint...

  11. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  12. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Full Text Available Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating, to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. Open standards, data

  13. Identification of SmtB/ArsR cis elements and proteins in archaea using the Prokaryotic InterGenic Exploration Database (PIGED

    Directory of Open Access Journals (Sweden)

    Michael Bose

    2006-01-01

    Full Text Available Microbial genome sequencing projects have revealed an apparently wide distribution of SmtB/ArsR metal-responsive transcriptional regulators among prokaryotes. Using a position-dependent weight matrix approach, prokaryotic genome sequences were screened for SmtB/ArsR DNA binding sites using data derived from intergenic sequences upstream of orthologous genes encoding these regulators. Sixty SmtB/ArsR operators linked to metal detoxification genes, including nine among various archaeal species, are predicted among 230 annotated and draft prokaryotic genome sequences. Independent multiple sequence alignments of putative operator sites and corresponding winged helix-turn-helix motifs define sequence signatures for the DNA binding activity of this SmtB/ArsR subfamily. Prediction of an archaeal SmtB/ArsR based upon these signature sequences is confirmed using purified Methanosarcina acetivorans C2A protein and electrophoretic mobility shift assays. Tools used in this study have been incorporated into a web application, the Prokaryotic InterGenic Exploration Database (PIGED; http://bioinformatics.uwp.edu/~PIGED/home.htm, facilitating comparable studies. Use of this tool and establishment of orthology based on DNA binding signatures holds promise for deciphering potential cellular roles of various archaeal winged helix-turn-helix transcriptional regulators.

  14. Identification of SmtB/ArsR cis elements and proteins in archaea using the Prokaryotic InterGenic Exploration Database (PIGED).

    Science.gov (United States)

    Bose, Michael; Slick, David; Sarto, Mickey J; Murphy, Patrick; Roberts, David; Roberts, Jacqueline; Barber, Robert D

    2006-08-01

    Microbial genome sequencing projects have revealed an apparently wide distribution of SmtB/ArsR metal-responsive transcriptional regulators among prokaryotes. Using a position-dependent weight matrix approach, prokaryotic genome sequences were screened for SmtB/ArsR DNA binding sites using data derived from intergenic sequences upstream of orthologous genes encoding these regulators. Sixty SmtB/ArsR operators linked to metal detoxification genes, including nine among various archaeal species, are predicted among 230 annotated and draft prokaryotic genome sequences. Independent multiple sequence alignments of putative operator sites and corresponding winged helix-turn-helix motifs define sequence signatures for the DNA binding activity of this SmtB/ArsR subfamily. Prediction of an archaeal SmtB/ArsR based upon these signature sequences is confirmed using purified Methanosarcina acetivorans C2A protein and electrophoretic mobility shift assays. Tools used in this study have been incorporated into a web application, the Prokaryotic InterGenic Exploration Database (PIGED; http://bioinformatics.uwp.edu/~PIGED/home.htm), facilitating comparable studies. Use of this tool and establishment of orthology based on DNA binding signatures holds promise for deciphering potential cellular roles of various archaeal winged helix-turn-helix transcriptional regulators. PMID:16877320

  15. Bioinformatic methods in protein characterization

    OpenAIRE

    Kallberg, Yvonne

    2002-01-01

    Bioinformatics is an emerging interdisciplinary research field in which mathematics. computer science and biology meet. In this thesis. bioinformatic methods for analysis of functional and structural properties among proteins will be presented. I have developed and applied bioinformatic methods on the enzyme superfamily of short-chain dehydrogenases/reductases (SDRs), coenzyme-binding enzymes of the Rossmann fold type, and amyloid-forming proteins and peptides. The basis...

  16. Bioinformatics and genomic medicine.

    Science.gov (United States)

    Kim, Ju Han

    2002-01-01

    Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational science. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolution in both bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, in much the same way that biochemistry did a generation ago. This paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine-learning algorithms are discussed. Use of integrative biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and the integrated management of biomolecular databases, are also discussed. PMID:12544491

  17. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  18. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  19. Bioinformatics on the cloud computing platform Azure.

    Science.gov (United States)

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  20. [An overview of feature selection algorithm in bioinformatics].

    Science.gov (United States)

    Li, Xin; Ma, Li; Wang, Jinjia; Zhao, Chun

    2011-04-01

    Feature selection (FS) techniques have become an important tool in bioinformatics field. The core algorithm of it is to select the hidden significant data with low-dimension from high-dimensional data space, and thus to analyse the basic built-in rule of the data. The data of bioinformatics fields are always with high-dimension and small samples, so the research of FS algorithm in the bioinformatics fields has great foreground. In this article, we make the interested reader aware of the possibilities of feature selection, provide basic properties of feature selection techniques, and discuss their uses in the sequence analysis, microarray analysis, mass spectra analysis etc. Finally, the current problems and the prospects of feature selection algorithm in the application of bioinformatics is also discussed. PMID:21604512

  1. String Mining in Bioinformatics

    Science.gov (United States)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, this point of view, however, has not been explicitly embraced neither in the data mining nor in the sequence analysis text books, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  2. Primary structure and promoter analysis of leghemoglobin genes of the stem-nodulated tropical legume Sesbania rostrata: conserved coding sequences, cis-elements and trans-acting factors

    DEFF Research Database (Denmark)

    Metz, B A; Welters, P; Hoffmann, H J; Jensen, E O; Schell, J

    1988-01-01

    The primary structure of a leghemoglobin (lb) gene from the stem-nodulated, tropical legume Sesbania rostrata and two lb gene promoter regions was analysed. The S. rostrata lb gene structure and Lb amino acid composition were found to be highly conserved with previously described lb genes and Lb...... proteins. Distinct DNA elements were identified in the S. rostrata lb promoter regions, which share a high degree of homology with cis-active regulatory elements found in the soybean (Glycine max) lbc3 promoter. One conserved DNA element was found to interact specifically with an apparently universal...

  3. Clustering Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Muhammad Ali Masood

    2015-01-01

    Full Text Available Dealing with data means to group information into a set of categories either in order to learn new artifacts or understand new domains. For this purpose researchers have always looked for the hidden patterns in data that can be defined and compared with other known notions based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, having the tools of data classification and data clustering, is one of the most powerful techniques to deal with data in such a manner that it can help researchers identify the required information. As a step forward to address this challenge, experts have utilized clustering techniques as a mean of exploring hidden structure and patterns in underlying data. Improved stability, robustness and accuracy of unsupervised data classification in many fields including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself as a reliable tool. To identify the clusters in datasets algorithm are utilized to partition data set into several groups based on the similarity within a group. There is no specific clustering algorithm, but various algorithms are utilized based on domain of data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized based upon different approaches. This paper is a survey of few clustering techniques out of many in data mining. For the purpose five of the most common clustering techniques out of many have been discussed. The clustering techniques which have been surveyed are: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN and Self-Organizing Map (SOM clustering.

  4. Application Of Data Mining In Bioinformatics

    OpenAIRE

    KHALID RAZA

    2012-01-01

    This article highlights some of the basic concepts of bioinformatics and data mining. The major research areas of bioinformatics are highlighted. The application of data mining in the domain of bioinformatics is explained. It also highlights some of the current challenges and opportunities of data mining in bioinformatics.

  5. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude;

    2012-01-01

    to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes...... and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  6. Light and abiotic stresses regulate the expression of GDP-L-galactose phosphorylase and levels of ascorbic acid in two kiwifruit genotypes via light-responsive and stress-inducible cis-elements in their promoters.

    Science.gov (United States)

    Li, Juan; Liang, Dong; Li, Mingjun; Ma, Fengwang

    2013-09-01

    Ascorbic acid (AsA) plays an essential role in plants by protecting cells against oxidative damage. GDP-L-galactose phosphorylase (GGP) is the first committed gene for AsA synthesis. Our research examined AsA levels, regulation of GGP gene expression, and how these are related to abiotic stresses in two species of Actinidia (kiwifruit). When leaves were subjected to continuous darkness or light, ABA or MeJA, heat, or a hypoxic environment, we found some correlation between the relative levels of GGP mRNA and AsA concentrations. In transformed tobacco plants, activity of the GGP promoter was induced by all of these treatments. However, the degree of inducibility in the two kiwifruit species differed among the GGP promoter deletions. We deduced that the G-box motif, a light-responsive element, may have an important function in regulating GGP transcripts under various light conditions in both A. deliciosa and A. eriantha. Other elements such as ABRE, the CGTCA motif, and HSE might also control the promoter activities of GGP in kiwifruit. Altogether, these data suggest that GGP expression in the two kiwifruit species is regulated by light or abiotic stress via the relative cis-elements in their promoters. Furthermore, GGP has a critical role in modulating AsA concentrations in kiwifruit species under abiotic stresses. PMID:23775440

  7. ORA47 (octadecanoid-responsive AP2/ERF-domain transcription factor 47) regulates jasmonic acid and abscisic acid biosynthesis and signaling through binding to a novel cis-element.

    Science.gov (United States)

    Chen, Hsing-Yu; Hsieh, En-Jung; Cheng, Mei-Chun; Chen, Chien-Yu; Hwang, Shih-Ying; Lin, Tsan-Piao

    2016-07-01

    ORA47 (octadecanoid-responsive AP2/ERF-domain transcription factor 47) of Arabidopsis thaliana is an AP2/ERF domain transcription factor that regulates jasmonate (JA) biosynthesis and is induced by methyl JA treatment. The regulatory mechanism of ORA47 remains unclear. ORA47 is shown to bind to the cis-element (NC/GT)CGNCCA, which is referred to as the O-box, in the promoter of ABI2. We proposed that ORA47 acts as a connection between ABA INSENSITIVE1 (ABI1) and ABI2 and mediates an ABI1-ORA47-ABI2 positive feedback loop. PORA47:ORA47-GFP transgenic plants were used in a chromatin immunoprecipitation (ChIP) assay to show that ORA47 participates in the biosynthesis and/or signaling pathways of nine phytohormones. Specifically, many abscisic acid (ABA) and JA biosynthesis and signaling genes were direct targets of ORA47 under stress conditions. The JA content of the P35S:ORA47-GR lines was highly induced under wounding and moderately induced under water stress relative to that of the wild-type plants. The wounding treatment moderately increased ABA accumulation in the transgenic lines, whereas the water stress treatment repressed the ABA content. ORA47 is proposed to play a role in the biosynthesis of JA and ABA and in regulating the biosynthesis and/or signaling of a suite of phytohormone genes when plants are subjected to wounding and water stress. PMID:26974851

  8. Generations of interdisciplinarity in bioinformatics

    Science.gov (United States)

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  9. CryptoDB: a Cryptosporidium bioinformatics resource update

    OpenAIRE

    Heiges, Mark; Wang, Haiming; Robinson, Edward; Aurrecoechea, Cristina; Gao, Xin; Kaluskar, Nivedita; Rhodes, Philippa; Wang, Sammy; He, Cong-Zhou; Su, Yanqi; Miller, John; Kraemer, Eileen; Kissinger, Jessica C

    2005-01-01

    The database, CryptoDB (), is a community bioinformatics resource for the AIDS-related apicomplexan-parasite, Cryptosporidium. CryptoDB integrates whole genome sequence and annotation with expressed sequence tag and genome survey sequence data and provides supplemental bioinformatics analyses and data-mining tools. A simple, yet comprehensive web interface is available for mining and visualizing the data. CryptoDB is allied with the databases PlasmoDB and ToxoDB via ApiDB, an NIH/NIAID-funded...

  10. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...

  11. Visualising "Junk" DNA through Bioinformatics

    Science.gov (United States)

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology,and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  12. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  13. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  14. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Science.gov (United States)

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  15. A Transcription Factor γMYB1 Binds to the P1BS cis-Element and Activates PLA2-γ Expression with its Co-Activator γMYB2.

    Science.gov (United States)

    Nguyen, Ha Thi Kim; Kim, Soo Youn; Cho, Kwang-Moon; Hong, Jong Chan; Shin, Jeong Sheop; Kim, Hae Jin

    2016-04-01

    Phospholipase A2(PLA2) hydrolyzes phospholipid molecules to produce two products that are both precursors of second messengers of signaling pathways and signaling molecules per se.Arabidopsis thaliana PLA2paralogs (-β,-γand-δ) play critical roles during pollen development, pollen germination and tube growth. In this study, analysis of thePLA2-γpromoter using a deletion series revealed that the promoter region -153 to -1 is crucial for its pollen specificity. Using a yeast one-hybrid screening assay with thePLA2-γpromoter and an Arabidopsis transcription factor (TF)-only library, we isolated two novel MYB-like TFs belonging to the MYB-CC family, denoted here as γMYB1 and γMYB2. By electrophoretic mobility shift assay, we found that these two TFs bind directly to the P1BS (phosphate starvation response 1-binding sequence)cis-element of thePLA2-γpromoter. γMYB1 alone functioned as a transcriptional activator forPLA2-γexpression, whereas γMYB2 directly interacted with γMYB1 and enhanced its activation. Overexpression ofγMYB1in the mature pollen grain led to increased expression of not only thePLA2-γgene but also of several genes whose promoters contain the P1BScis-element and which are involved in the Pi starvation response, phospholipid biosynthesis and sugar synthesis. Based on these results, we suggest that the TF γMYB1 binds to the P1BScis-element, activates the expression ofPLA2-γwith the assistance of its co-activator, γMYB2, and regulates the expression of several target genes involved in many plant metabolic reactions. PMID:26872838

  16. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Tammi Martti; Ranganathan Shoba; Gribskov Michael; Tan Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  17. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  18. Bioinformatic and Biometric Methods in Plant Morphology

    Directory of Open Access Journals (Sweden)

    Surangi W. Punyasena

    2014-08-01

    Full Text Available Recent advances in microscopy, imaging, and data analyses have permitted both the greater application of quantitative methods and the collection of large data sets that can be used to investigate plant morphology. This special issue, the first for Applications in Plant Sciences, presents a collection of papers highlighting recent methods in the quantitative study of plant form. These emerging biometric and bioinformatic approaches to plant sciences are critical for better understanding how morphology relates to ecology, physiology, genotype, and evolutionary and phylogenetic history. From microscopic pollen grains and charcoal particles, to macroscopic leaves and whole root systems, the methods presented include automated classification and identification, geometric morphometrics, and skeleton networks, as well as tests of the limits of human assessment. All demonstrate a clear need for these computational and morphometric approaches in order to increase the consistency, objectivity, and throughput of plant morphological studies.

  19. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and  Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, at June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  20. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  1. Metadata Description Framework for Integration of Bioinformatics Information Resources: A Case of iBIRA

    Directory of Open Access Journals (Sweden)

    Shri Ram,

    2014-09-01

    Full Text Available Bioinformatics emerged as a new discipline dedicated to the answer the queries about life science using computational approaches. The basic aim of bioinformatics is to create databases, analyse data sets and managing data generated through large-scale projects such as Human Genome project (HGP. It covers a wide variety of traditional computer science domains, such as data modeling, data retrieval, data mining, data integration, data managing, data warehousing, and simulation of biological information generated through laboratory and field experiments. Due to varied form, nature, and activities in the field of bioinformatics, presenting the information in cohesive fashion is major challenge. The bioinformatics information resources are heterogeneous in nature. Integration and interoperability of information is one of the biggest challenges in this field. Bioinformatics, as an emerging field, needs attention towards metadata application for resource discovery. This paper discusses the metadata element set description framework for integration of various information resources related to the field of bioinformatics available over internet. A web-based tool iBIRA: Integrated Bioinformatics Information Resources Access has been designed and developed for integration of bioinformatics information resources. Dublin Core metadata element set has been used for description of information resources and XML schema has been used for interoperability of information resources with others. A database has been designed using structure query language (SQL–database management system and hypertext preprocessor (PHP–as web programming languages for integration of bioinformatics resources. The database is designated to categorise various resources into biological database, institutions, journals, patents, software tools, web-servers, etc., and the search result is presented in the form of ‘tree view’. Each of these categories of the resources has been analysed

  2. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  3. Fuzzy Logic in Medicine and Bioinformatics

    OpenAIRE

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes).

  4. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  5. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  6. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  7. Rapid Development of Bioinformatics Education in China

    Science.gov (United States)

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  8. Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis1[C][W][OA

    Science.gov (United States)

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

    2012-01-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  9. Integration of bioinformatics and synthetic promoters leads to the discovery of novel elicitor-responsive cis-regulatory sequences in Arabidopsis.

    Science.gov (United States)

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J; Hehl, Reinhard

    2012-09-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  10. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  11. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  12. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  13. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Directory of Open Access Journals (Sweden)

    Michael Römer

    Full Text Available Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  14. Best practices in bioinformatics training for life scientists

    DEFF Research Database (Denmark)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik;

    2013-01-01

    to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse......The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians...... their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes...

  15. Bioinformatics. Data base and application execution web system

    International Nuclear Information System (INIS)

    Research for the biological systems has reached to the stage that clarifies an organism as a whole from genome, a set of genes. To accomplish the researches, one needs to face a lot of data retrieved from genome sequences. Conventional methods, namely experiments in wet-labs, are, however not designed for dealing with genome scale data. For those data analyses, computer based researches, such as data mining and simulation, are suitable. As a result, bioinformatics, a new discipline combining expertise of biology and information science, is emerging. Here, we report a development of a data base and a web based system for application execution (Execution of Application on web system). Those are one of the efforts to support bioinformatics research by Office of ITBL Promotion. (author)

  16. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  17. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools

    Directory of Open Access Journals (Sweden)

    Daisuke Kihara

    2006-01-01

    Full Text Available The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow for users to analyze data from many experiments (data handling, databases, pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlight the most appropriate and useful resources available to cancer researchers. These include not only public databases, but general and specific bioinformatics tools which can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference to cancer researchers and bioinformaticians studying cancer alike.

  18. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. PMID:26851671

  19. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS, adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded and http://soap.g-language.org/kbws_dl.wsdl (Document/literal.

  20. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  1. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  2. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  3. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...

  4. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  5. Integrative Bioinformatics for Genomics and Proteomics

    OpenAIRE

    Wu, C.H.

    2011-01-01

    Systems integration is becoming the driving force for 21st century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes and transcriptomes to proteomes and interactomes. To fully realize the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. We are developing a bioinformatics research ...

  6. Bioinformatics clouds for big data manipulation

    OpenAIRE

    Dai Lin; Gao Xin; Guo Yan; Xiao Jingfa; Zhang Zhang

    2012-01-01

    Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and a...

  7. No-boundary thinking in bioinformatics research

    OpenAIRE

    Huang, Xiuzhen; Bruce, Barry; Buchan, Alison; Congdon, Clare Bates; Cramer, Carole L.; Jennings, Steven F; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Rick; Moore, Jason H.; Nanduri, Bindu; Peckham, Joan; Perkins, Andy; Polson, Shawn W.

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT).

  8. Bioinformatics for saffron (Crocus sativus L.) improvement

    OpenAIRE

    Ghulam A. PARRAY; Abdul G. Rather; Parvez Sofi; Shafiq A. Wani; Amjad M. Husaini; Asif B. Shikari; Javid I. Mir

    2009-01-01

    Saffron (Crocus sativus L.) is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots). Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The ch...

  9. Agile parallel bioinformatics workflow management using Pwrake

    OpenAIRE

    Tanaka Masahiro; Sasaki Kensaku; Mishima Hiroyuki; Tatebe Osamu; Yoshiura Koh-ichiro

    2011-01-01

    Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environm...

  10. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced have given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid and downstream to nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed that the time of most recent common ancestor of human/civet SARS related coronavirus to be 1999-2002, with estimated substitution rate of 4´10-4 to 2´10-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV, between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1. Codon usage bias in coronaviruses were observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.

  11. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    Science.gov (United States)

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms. PMID:26956711

  12. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  13. Bioinformatic Challenges in Clinical Diagnostic Application of Targeted Next Generation Sequencing: Experience from Pheochromocytoma.

    Directory of Open Access Journals (Sweden)

    Joakim Crona

    Full Text Available Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS compared to Sanger Sequencing. Whereas these novel sequencing processes have a validated robust performance, choice of enrichment method and different available bioinformatic software as reliable analysis tool needs to be further investigated in a diagnostic setting.DNA from 21 patients with genetic variants in SDHB, VHL, EPAS1, RET, (n=17 or clinical criteria of NF1 syndrome (n=4 were included. Targeted NGS was performed using Truseq custom amplicon enrichment sequenced on an Illumina MiSEQ instrument. Results were analysed in parallel using three different bioinformatics pipelines; (1 Commercially available MiSEQ Reporter, fully automatized and integrated software, (2 CLC Genomics Workbench, graphical interface based software, also commercially available, and ICP (3 an in-house scripted custom bioinformatic tool.A tenfold read coverage was achieved in between 95-98% of targeted bases. All workflows had alignment of reads to SDHA and NF1 pseudogenes. Compared to Sanger sequencing, variant calling revealed a sensitivity ranging from 83 to 100% and a specificity of 99.9-100%. Only MiSEQ reporter identified all pathogenic variants in both sequencing runs.We conclude that targeted next generation sequencing have equal quality compared to Sanger sequencing. Enrichment specificity and the bioinformatic performance need to be carefully assessed in a diagnostic setting. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills utilizing already existing commercially available bioinformatics tools.

  14. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer;

    2015-01-01

    within convenient, integrated “workbench” environments. Resource descriptions are the core element of registry and workbench systems, which are used to both help the user find and comprehend available software tools, data resources, and Web Services, and to localise, execute and combine them......The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially......, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry....

  15. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  16. Bioinformatics in crosslinking chemistry of collagen with selective cross linkers

    Directory of Open Access Journals (Sweden)

    Gopal Ramesh

    2011-10-01

    Full Text Available Abstract Background Identifying the molecular interactions using bioinformatics tools before venturing into wet lab studies saves the energy and time considerably. The present study summarizes, molecular interactions and binding energy calculations made for major structural protein, collagen of Type I and Type III with the chosen cross-linkers, namely, coenzyme Q10, dopaquinone, embelin, embelin complex-1 & 2, idebenone, 5-O-methyl embelin, potassium embelate and vilangin. Results Molecular descriptive analyses suggest, dopaquinone, embelin, idebenone, 5-O-methyl embelin, and potassium embelate display nil violations. And results of docking analyses revealed, best affinity for Type I (- 4.74 kcal/mol and type III (-4.94 kcal/mol collagen was with dopaquinone. Conclusions Among the selected cross-linkers, dopaquinone, embelin, potassium embelate and 5-O-methyl embelin were the suitable cross-linkers for both Type I and Type III collagen and stabilizes the collagen at the expected level.

  17. KDE Bioscience: platform for bioinformatics analysis workflows.

    Science.gov (United States)

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces make it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built-in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research. PMID:16260186

  18. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotides polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapter 2, 3 and 4), and the subsequent imple

  19. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformatic

  20. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  1. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  2. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  3. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  4. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  5. Evolutionary Computation Applications in Current Bioinformatics

    OpenAIRE

    Wang, Bing; Zhang, Xiang

    2010-01-01

    This chapter provides an overview of some bioinformatics tasks and the relevance of the evolutionary computation methods, especially GAs. There are two advantages of GA-based approaches. One is that GAs are easier to run in parallel than single trajectory search procedures, and therefore allow groups of processors to be utilized for a search. The other is

  6. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Full Text Available Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  7. Agile parallel bioinformatics workflow management using Pwrake

    Directory of Open Access Journals (Sweden)

    Tanaka Masahiro

    2011-09-01

    Full Text Available Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows

  8. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  9. Novel bioinformatic developments for exome sequencing.

    Science.gov (United States)

    Lelieveld, Stefan H; Veltman, Joris A; Gilissen, Christian

    2016-06-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges still remain; especially the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret exome data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions that this is taking us to. With these developments, exome sequencing is paving the way for the next big challenge, the application of whole genome sequencing. PMID:27075447

  10. Bioinformatics for saffron (Crocus sativus L. improvement

    Directory of Open Access Journals (Sweden)

    Ghulam A. Parray

    2009-02-01

    Full Text Available Saffron (Crocus sativus L. is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots. Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The characterization of the transcriptome of saffron stigmas is the most vital for throwing light on the molecular basis of flavor, color biogenesis, genomic organization and biology of gynoecium of saffron. The information derived can be utilized for constructing biological pathways involved in the biosynthesis of principal components of saffron i.e., crocin, crocetin, safranal, picrocrocin and safchiA

  11. Discovery and Classification of Bioinformatics Web Services

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminish the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrapWeb services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  12. A Novel Approach for Bioinformatics Workflow Discovery

    Directory of Open Access Journals (Sweden)

    Walaa Nagy

    2014-11-01

    Full Text Available Workflow systems are typical fit for in the explorative research of bioinformaticians. These systems can help bioinformaticians to design and run their experiments and to automatically capture and store the data generated at runtime. On the other hand, Web services are increasingly used as the preferred method for accessing and processing the information coming from the diverse life science sources. In this work we provide an efficient approach for creating bioinformatic workflow for all-service architecture systems (i.e., all system components are services . This architecture style simplifies the user interaction with workflow systems and facilitates both the change of individual components, and the addition of new components to adopt to other workflow tasks if required. We finally present a case study for the bioinformatics domain to elaborate the applicability of our proposed approach.

  13. Adapting bioinformatics curricula for big data

    OpenAIRE

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S; Jason H Moore

    2015-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these...

  14. Chapter 16: Text Mining for Translational Bioinformatics

    OpenAIRE

    Bretonnel Cohen, K; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. P...

  15. Genome bioinformatics of tomato and potato

    OpenAIRE

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  16. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    OpenAIRE

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of protein or DNA sequences. An accurate alignment can provide valuable information for experimentation on the newly found sequences. It is indispensable in basic research as well as in practical applic...

  17. Bioinformatics and the discovery of gene function

    OpenAIRE

    Casari, G; Daruvar, Dea; Sander, C.; Schneider, Reinhard

    1996-01-01

    Scientific history was made in completing the yeast genuine sequence, yet its 13 Mb are a mere starting point. Two challenges loom large: to decipher the function of all genes and to describe the workings of the eukaryotic cell in full molecular detail. A combination of experimental and theoretical approaches will be brought to bear on these challenges. What will be next in yeast genome analysis from the point of view of bioinformatics?

  18. A Novel Approach for Bioinformatics Workflow Discovery

    OpenAIRE

    Walaa Nagy; Hoda M.O. Mokhtar

    2014-01-01

    Workflow systems are typical fit for in the explorative research of bioinformaticians. These systems can help bioinformaticians to design and run their experiments and to automatically capture and store the data generated at runtime. On the other hand, Web services are increasingly used as the preferred method for accessing and processing the information coming from the diverse life science sources. In this work we provide an efficient approach for creating bioinformatic workflow for all-serv...

  19. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  20. Management and Marketing of Bioinformatics Tools

    OpenAIRE

    Sudhakar, R.; Gupta, Kuhu; Kumar, Sushant

    2013-01-01

    Bioinformatics can be defined as conceptualization of biology, in specific- Molecular Biology and then application of certain techniques from multiple disciplines such as statistics, computer science and applied mathematics to analyze and understand the vast information related to molecular structures. Hence, its management becomes difficult. The manager must be thorough with the concepts of biology- genetic studies in particular, as well as information technology. Discussed below is the mana...

  1. Bioinformatic pipelines in Python with Leaf

    OpenAIRE

    Napolitano, Francesco; Mariani-Costantini, Renato; Tagliaferri, Roberto

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum...

  2. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    Science.gov (United States)

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease. PMID:22933157

  3. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  4. Bioinformatics Approach in Plant Genomic Research.

    Science.gov (United States)

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    The advance in genomics technology leads to the dramatic change in plant biology research. Plant biologists now easily access to enormous genomic data to deeply study plant high-density genetic variation at molecular level. Therefore, fully understanding and well manipulating bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and continued expanding recently. Meanwhile, analytical methods based on bioinformatics are also well developed in many aspects of plant genomic research including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association study. However, constantly upgrading in computational infrastructures, such as high capacity data storage and high performing analysis software, is the real challenge for plant genome research. This review paper focuses on challenges and opportunities which knowledge and skills in bioinformatics can bring to plant scientists in present plant genomics era as well as future aspects in critical need for effective tools to facilitate the translation of knowledge from new sequencing data to enhancement of plant productivity. PMID:27499685

  5. The European Bioinformatics Institute in 2016: Data growth and integration.

    Science.gov (United States)

    Cook, Charles E; Bergman, Mary Todd; Finn, Robert D; Cochrane, Guy; Birney, Ewan; Apweiler, Rolf

    2016-01-01

    New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources. PMID:26673705

  6. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  7. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  8. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  9. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive ''virulence detection chip'' by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  10. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. PMID:26358759

  11. Identification of novel genes and pathways in carotid atheroma using integrated bioinformatic methods

    OpenAIRE

    Wenqing Nai; Diane Threapleton; Jingbo Lu; Kewei Zhang; Hongyuan Wu; You Fu; Yuanyuan Wang; Zejin Ou; Lanlan Shan; Yan Ding; Yanlin Yu; Meng Dai

    2016-01-01

    Atherosclerosis is the primary cause of cardiovascular events and its molecular mechanism urgently needs to be clarified. In our study, atheromatous plaques (ATH) and macroscopically intact tissue (MIT) sampled from 32 patients were compared and an integrated series of bioinformatic microarray analyses were used to identify altered genes and pathways. Our work showed 816 genes were differentially expressed between ATH and MIT, including 443 that were up-regulated and 373 that were down-regula...

  12. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  13. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Abstract Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'. A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption. An add-on module ('NuBio' facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures and functionality (e.g., to parse/write standard file formats. Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and

  14. The 2015 Bioinformatics Open Source Conference (BOSC 2015.

    Directory of Open Access Journals (Sweden)

    Nomi L Harris

    2016-02-01

    Full Text Available The Bioinformatics Open Source Conference (BOSC is organized by the Open Bioinformatics Foundation (OBF, a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG before the annual Intelligent Systems in Molecular Biology (ISMB conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  15. The bioinformatics of next generation sequencing: a meeting report

    Institute of Scientific and Technical Information of China (English)

    Ravi Shankar

    2011-01-01

    @@ The Studio of Computational Biology & Bioinformatics (SCBB), IHBT, CSIR,Palampur, India organized one of the very first national workshop funded by DBT,Govt.of India, on the Bioinformatics issues associated with next generation sequencing approaches.The course structure was designed by SCBB, IHBT.The workshop took place in the IHBT premise on 17 and 18 June 2010.

  16. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  17. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  18. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  19. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein;

    2014-01-01

    cancer immunotherapies has yet to be fulfilled. The insufficient efficacy of existing treatments can be attributed to a number of biological and technical issues. In this review, we detail the current limitations of immunotherapy target selection and design, and review computational methods to streamline......The mechanisms of immune response to cancer have been studied extensively and great effort has been invested into harnessing the therapeutic potential of the immune system. Immunotherapies have seen significant advances in the past 20 years, but the full potential of protective and therapeutic...... and co-targets for single-epitope and multi-epitope strategies. We provide examples of application to the well-known tumor antigen HER2 and suggest bioinformatics methods to ameliorate therapy resistance and ensure efficient and lasting control of tumors....

  20. Combining multiple decisions: applications to bioinformatics

    International Nuclear Information System (INIS)

    Multi-class classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. This article reviews two recent approaches to multi-class classification by combining multiple binary classifiers, which are formulated based on a unified framework of error-correcting output coding (ECOC). The first approach is to construct a multi-class classifier in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. In the second approach, misclassification of each binary classifier is formulated as a bit inversion error with a probabilistic model by making an analogy to the context of information transmission theory. Experimental studies using various real-world datasets including cancer classification problems reveal that both of the new methods are superior or comparable to other multi-class classification methods

  1. Bioinformatics Analysis of Estrogen-Responsive Genes.

    Science.gov (United States)

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  2. Academic Training - Bioinformatics: Decoding the Genome

    CERN Multimedia

    Chris Jones

    2006-01-01

    ACADEMIC TRAINING LECTURE SERIES 27, 28 February 1, 2, 3 March 2006 from 11:00 to 12:00 - Auditorium, bldg. 500 Decoding the Genome A special series of 5 lectures on: Recent extraordinary advances in the life sciences arising through new detection technologies and bioinformatics The past five years have seen an extraordinary change in the information and tools available in the life sciences. The sequencing of the human genome, the discovery that we possess far fewer genes than foreseen, the measurement of the tiny changes in the genomes that differentiate us, the sequencing of the genomes of many pathogens that lead to diseases such as malaria are all examples of completely new information that is now available in the quest for improved healthcare. New tools have allowed similar strides in the discovery of the associated protein structures, providing invaluable information for those searching for new drugs. New DNA microarray chips permit simultaneous measurement of the state of expression of tens...

  3. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    Next generation sequencing (NGS) has revolutionized the field of genomics and its wide range of applications has resulted in the genome-wide analysis of hundreds of species and the development of thousands of computational tools. This thesis represents my work on NGS analysis of four species, Lotus...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and microrrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  4. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    Directory of Open Access Journals (Sweden)

    Kristina M. Obom

    2009-12-01

    Full Text Available The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses, however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation.

  5. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    OpenAIRE

    Obom, Kristina. M.; Cummings, Patrick J.

    2009-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses, however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no signifi...

  6. 基于131I摄取功能的乳头状甲状腺癌肺转移患者血清差异miRNA的筛选及生物信息学分析%Screening and bioinformatic analyses of serum microRNAs in papillary thyroid carcinoma with lung metastases based on iodine-131 uptake capacity

    Institute of Scientific and Technical Information of China (English)

    薛艳丽; 邱忠领; 宋红俊; 罗全勇

    2013-01-01

    目的 筛选与乳头状甲状腺癌(PTC)肺转移灶131I摄取功能相关的血清miRNAs.方法 收集18例伴肺转移PTC患者血清标本,分为A、B两组,A组9例,为肺转移灶具有摄取131I功能者,B组9例,为肺转移灶无131I摄取功能者.采用miRNA芯片检测两组间的差异血清miRNAs,并经生物信息学分析筛选相关核心miRNA.结果 通过miRNA的检测,发现A、B两组患者血清间具有显著差异的miRNAs 13种,其中5种miRNAs在B组上调,8种miRNAs在B组下调.生物信息学分析发现血清miR-106a为核心miRNA,并参与调节193个靶基因,其中包括MAPK1、MAP3K8、MAP3K3、MAP3K12、MAP3K5、MAP3K14、MAP3K2、MAPK11等基因.结论 PTC肺转移灶具有131I摄取功能组患者血清与肺转移灶无131I摄取功能组患者血清miRNAs存在显著差异,在肺转移灶无131I摄取功能组患者血清中显著上调的miR-106a为核心miRNA,可能与肺转移灶丧失131I摄取功能有关.%Objective To compare the differences in serum miRNAs between papillary thyroid carcinoma (PTC) patients with 131I-avid lung metastases and those with non-131I-avid lung metastases,and to screen a serum miRNA relevant to 131I uptake in lung metastases from PTC.Methods We collected serum samples from 9 PTC patients with 131I-avid lung metastases(Group A)and 9 PTC patients with non-131I-avid lung metastases(Group B)respectively,and both groups were matched for age and sex,without history of other tumors.The serum miRNAs were profiled using miRNA chip technology and the difference between the two groups was compared.Bioinformatics analysis was performed to screen the core miRNAs.Results A total of 13 serum miRNAs with statistically significant differences were detected between Group A and Group B.Among them,5 miRNAs displayed high expression,and 8 miRNAs showed low expression in Group B.Bioinformatics analysis showed that serum miR-106a was the core miRNA.The miR-106a was involved in regulating 193 target genes

  7. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  8. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  9. BCCTBbp: the Breast Cancer Campaign Tissue Bank bioinformatics portal.

    Science.gov (United States)

    Cutts, Rosalind J; Guerra-Assunção, José Afonso; Gadaleta, Emanuela; Dayem Ullah, Abu Z; Chelala, Claude

    2015-01-01

    BCCTBbp (http://bioinformatics.breastcancertissue bank.org) was initially developed as the data-mining portal of the Breast Cancer Campaign Tissue Bank (BCCTB), a vital resource of breast cancer tissue for researchers to support and promote cutting-edge research. BCCTBbp is dedicated to maximising research on patient tissues by initially storing genomics, methylomics, transcriptomics, proteomics and microRNA data that has been mined from the literature and linking to pathways and mechanisms involved in breast cancer. Currently, the portal holds 146 datasets comprising over 227,795 expression/genomic measurements from various breast tissues (e.g. normal, malignant or benign lesions), cell lines and body fluids. BCCTBbp can be used to build on breast cancer knowledge and maximise the value of existing research. By recording a large number of annotations on samples and studies, and linking to other databases, such as NCBI, Ensembl and Reactome, a wide variety of different investigations can be carried out. Additionally, BCCTBbp has a dedicated analytical layer allowing researchers to further analyse stored datasets. A future important role for BCCTBbp is to make available all data generated on BCCTB tissues thus building a valuable resource of information on the tissues in BCCTB that will save repetition of experiments and expand scientific knowledge. PMID:25332396

  10. Bioinformatic characterization of glycyl radical enzyme-associated bacterial microcompartments.

    Science.gov (United States)

    Zarzycki, Jan; Erbilgin, Onur; Kerfeld, Cheryl A

    2015-12-01

    Bacterial microcompartments (BMCs) are proteinaceous organelles encapsulating enzymes that catalyze sequential reactions of metabolic pathways. BMCs are phylogenetically widespread; however, only a few BMCs have been experimentally characterized. Among them are the carboxysomes and the propanediol- and ethanolamine-utilizing microcompartments, which play diverse metabolic and ecological roles. The substrate of a BMC is defined by its signature enzyme. In catabolic BMCs, this enzyme typically generates an aldehyde. Recently, it was shown that the most prevalent signature enzymes encoded by BMC loci are glycyl radical enzymes, yet little is known about the function of these BMCs. Here we characterize the glycyl radical enzyme-associated microcompartment (GRM) loci using a combination of bioinformatic analyses and active-site and structural modeling to show that the GRMs comprise five subtypes. We predict distinct functions for the GRMs, including the degradation of choline, propanediol, and fuculose phosphate. This is the first family of BMCs for which identification of the signature enzyme is insufficient for predicting function. The distinct GRM functions are also reflected in differences in shell composition and apparently different assembly pathways. The GRMs are the counterparts of the vitamin B12-dependent propanediol- and ethanolamine-utilizing BMCs, which are frequently associated with virulence. This study provides a comprehensive foundation for experimental investigations of the diverse roles of GRMs. Understanding this plasticity of function within a single BMC family, including characterization of differences in permeability and assembly, can inform approaches to BMC bioengineering and the design of therapeutics. PMID:26407889

  11. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Full Text Available Abstract Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1 a web based, graphical user interface (GUI that enables a pipeline operator to manage the system; 2 the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution, and; 3 the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high throughput, flexible bioinformatics pipelines.

  12. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756

  13. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.;

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisci......The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into...... interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples....

  14. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    GAO; George; Fu

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  15. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    LIU Di; LIU Quan-He; WU Lin-Huan; LIU Bin; WU Jun; LAO Yi-Mei; LI Xiao-Jing; GAO George Fu; MA Jun-Cai

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread out worldwide and raised the public concerns. This increased the output of influenza virus sequence data as well as the research publication and other reports. In order to fight against H5N1 avian flu in a comprehensive way, we designed and started to set up the Website for Avian Flu Information (http://www.avian-flu.info) from 2004. Other than the influenza virus database available, the website is aiming to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information from all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, clinic and diagnosis. Except for publications, all information is in Chinese. Till April 15, 2009, the cumulative news entries had been over 2000 and research papers were approaching 5000. By using the curated data from Influenza Virus Resource, we have set up an influenza virus sequence database and a bioin-formatic platform, providing the basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of the data from the geological information system and avian influenza epidemiology.

  16. Bioinformatics in cancer therapy and drug design

    International Nuclear Information System (INIS)

    One of the mechanisms of external signal transduction (ionizing radiation, toxicants, stress) to the target cell is the existence of membrane and intracellular proteins with intrinsic tyrosine kinase activity. No wonder that etiology of malignant growth links to abnormalities in signal transduction through tyrosine kinases. The epidermal growth factor receptor (EGFR) tyrosine kinases play fundamental roles in development, proliferation and differentiation of tissues of epithelial, mesenchymal and neuronal origin. There are four types of EGFR: EGF receptor (ErbB1/HER1), ErbB2/Neu/HER2, ErbB3/HER3 and ErbB4/HER4. Abnormal expression of EGFR, appearance of receptor mutants with changed ability to protein-protein interactions or increased tyrosine kinase activity have been implicated in the malignancy of different types of human tumors. Bioinformatics is currently using in investigation on design and selection of drugs that can make alterations in structure or competitively bind with receptors and so display antagonistic characteristics. (authors)

  17. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Full Text Available Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.We designed and implemented the Genomics Virtual Laboratory (GVL as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  18. Survey of Natural Language Processing Techniques in Bioinformatics.

    Science.gov (United States)

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  19. Survey of Natural Language Processing Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Zhiqiang Zeng

    2015-01-01

    Full Text Available Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  20. Creating Bioinformatic Workflows within the BioExtract Server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  1. Scalable pattern recognition algorithms applications in computational biology and bioinformatics

    CERN Document Server

    Maji, Pradipta

    2014-01-01

    Reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics Includes numerous examples and experimental results to support the theoretical concepts described Concludes each chapter with directions for future research and a comprehensive bibliography

  2. Survey of Natural Language Processing Techniques in Bioinformatics

    OpenAIRE

    Zhiqiang Zeng; Hua Shi; Yun Wu; Zhiling Hong

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications ...

  3. Big Data Analytics in Bioinformatics: A Machine Learning Perspective

    OpenAIRE

    Kashyap, Hirak; Ahmed, Hasin Afzal; Hoque, Nazrul; Roy, Swarup; Bhattacharyya, Dhruba Kumar

    2015-01-01

    Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big data using the distributed and parallel computing technologies. Usually big data tools perform computation in batch-mode and are not optimized for iterative processing and high data dependency among operations. In the recent years, parallel, incremental, and ...

  4. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.;

    2010-01-01

    services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...... Trainer Networking Session held under the auspices of the EU-funded SLING Integrating Activity, which took place in November 2009....

  5. A high-throughput bioinformatics distributed computing platform

    OpenAIRE

    Keane, Thomas M; Page, Andrew J.; McInerney, James O.; Naughton, Thomas J

    2005-01-01

    In the past number of years the demand for high performance computing has greatly increased in the area of bioinformatics. The huge increase in size of many genomic databases has meant that many common tasks in bioinformatics are not possible to complete in a reasonable amount of time on a single processor. Recently distributed computing has emerged as an inexpensive alternative to dedicated parallel computing. We have developed a general-purpose distributed computing platform ...

  6. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT, to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs, thus greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work

  7. BioWarehouse: a bioinformatics database warehouse toolkit

    OpenAIRE

    Stringer-Calvert David WJ; Gupta Priyanka; Wagner Valerie; Pouliot Yannick; Lee Thomas J; Tenenbaum Jessica D; Karp Peter D

    2006-01-01

    Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) bu...

  8. Can Bioinformatics Be Considered as an Experimental Biological Science?

    OpenAIRE

    Ilzins, Olaf; Isea, Raul; Hoebeke, Johan

    2016-01-01

    The objective of this short report is to reconsider the subject of bioinformatics as just being a tool of experimental biological science. To do that, we introduce three examples to show how bioinformatics could be considered as an experimental science. These examples show how the development of theoretical biological models generates experimentally verifiable computer hypotheses, which necessarily must be validated by experiments in vitro or in vivo.

  9. CAPweb: a bioinformatics CGH array Analysis Platform

    OpenAIRE

    Liva, Stéphane; Hupé, Philippe; Neuvial, Pierre; Brito, Isabel; Viara, Eric; La Rosa, Philippe; Barillot, Emmanuel

    2006-01-01

    Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visu...

  10. Effects of advanced glycosylation end products on interactions between cis-elements and trans-acting factors of hepatic insulin receptor gene%晚期糖基化终末产物对胰岛素受体顺式调控元件和反式作用因子相互作用的影响

    Institute of Scientific and Technical Information of China (English)

    戎健; 于长青; 杨沛

    2009-01-01

    , along with nonspecifie binding increasing. Conclusion The impact of AGEs on hlR gene expression seems to be related to the effects on cis-elements and trans-acting factors.

  11. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    OpenAIRE

    Swainston Neil; Griffiths Tony; Hedeler Cornelia; Garwood Christopher; Garwood Kevin; Oliver Stephen G; Paton Norman W

    2006-01-01

    Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated i...

  12. Embracing the Future: Bioinformatics for High School Women

    Science.gov (United States)

    Zales, Charlotte Rappe; Cronin, Susan J.

    Sixteen high school women participated in a 5-week residential summer program designed to encourage female and minority students to choose careers in scientific fields. Students gained expertise in bioinformatics through problem-based learning in a complex learning environment of content instruction, speakers, labs, and trips. Innovative hands-on activities filled the program. Students learned biological principles in context and sophisticated bioinformatics tools for processing data. Students additionally mastered a variety of information-searching techniques. Students completed creative individual and group projects, demonstrating the successful integration of biology, information technology, and bioinformatics. Discussions with female scientists allowed students to see themselves in similar roles. Summer residential aspects fostered an atmosphere in which students matured in interacting with others and in their views of diversity.

  13. Drug targets for lymphatic filariasis: A bioinformatics approach

    Directory of Open Access Journals (Sweden)

    Om Prakash Sharma

    2013-08-01

    Full Text Available This review article discusses the current scenario of the national and international burden due to lymphatic filariasis (LF and describes the active elimination programmes for LF and their achievements to eradicate this most debilitating disease from the earth. Since, bioinformatics is a rapidly growing field of biological study, and it has an increasingly significant role in various fields of biology. We have reviewed its leading involvement in the filarial research using different approaches of bioinformatics and have summarized available existing drugs and their targets to re-examine and to keep away from the resisting conditions. Moreover, some of the novel drug targets have been assembled for further study to design fresh and better pharmacological therapeutics. Various bioinformatics-based web resources, and databases have been discussed, which may enrich the filarial research.

  14. Application of Bioinformatics and Systems Biology in Medicinal Plant Studies

    Institute of Scientific and Technical Information of China (English)

    DENG You-ping; AI Jun-mei; XIAO Pei-gen

    2010-01-01

    One important purpose to investigate medicinal plants is to understand genes and enzymes that govern the biological metabolic process to produce bioactive compounds.Genome wide high throughput technologies such as genomics,transcriptomics,proteomics and metabolomics can help reach that goal.Such technologies can produce a vast amount of data which desperately need bioinformatics and systems biology to process,manage,distribute and understand these data.By dealing with the"omics"data,bioinformatics and systems biology can also help improve the quality of traditional medicinal materials,develop new approaches for the classification and authentication of medicinal plants,identify new active compounds,and cultivate medicinal plant species that tolerate harsh environmental conditions.In this review,the application of bioinformatics and systems biology in medicinal plants is briefly introduced.

  15. A Survey of the Availability of Primary Bioinformatics Web Resources

    Institute of Scientific and Technical Information of China (English)

    Trias Thireou; George Spyrou; Vassilis Atlamazoglou

    2007-01-01

    The explosive growth of the bioinformatics field has led to a large amount of data and software applications publicly available as web resources. However, the lack of persistence of web references is a barrier to a comprehensive shared access. We conducted a study of the current availability and other features of primary bioinformatics web resources (such as software tools and databases). The majority (95%) of the examined bioinformatics web resources were found running on UNIX/Linux operating systems, and the most widely used web server was found to be Apache (or Apache-related products). Of the overall 1,130 Uniform Resource Locators (URLs) examined, 91% were highly available (more than 90% of the time), while only 4% showed low accessibility (less than 50% of the time) during the survey. Furthermore, the most common URL failure modes are presented and analyzed.

  16. Probabilistic models and machine learning in structural bioinformatics

    DEFF Research Database (Denmark)

    Hamelryck, Thomas

    2009-01-01

    Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes...... and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules and structure...... bioinformatics. Recently, probabilistic models and machine learning methods based on Bayesian principles are providing efficient and rigorous solutions to challenging problems that were long regarded as intractable. In this review, I will highlight some important recent developments in the prediction, analysis...

  17. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    Science.gov (United States)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  18. Approaches in integrative bioinformatics towards the virtual cell

    CERN Document Server

    Chen, Ming

    2014-01-01

    Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrat

  19. High-performance computational solutions in protein bioinformatics

    CERN Document Server

    Mrozek, Dariusz

    2014-01-01

    Recent developments in computer science enable algorithms previously perceived as too time-consuming to now be efficiently used for applications in bioinformatics and life sciences. This work focuses on proteins and their structures, protein structure similarity searching at main representation levels and various techniques that can be used to accelerate similarity searches. Divided into four parts, the first part provides a formal model of 3D protein structures for functional genomics, comparative bioinformatics and molecular modeling. The second part focuses on the use of multithreading for

  20. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  1. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    Directory of Open Access Journals (Sweden)

    López Elena

    2011-10-01

    Full Text Available Abstract Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools.

  2. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    Science.gov (United States)

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  3. BioRuby: Bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, N.; Prins, J.C.P.; Nakao, M.; Bonnal, R.; Aerts, J.; Katayama, A.

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  4. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Science.gov (United States)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  5. CROSSWORK for Glycans: Glycan Identificatin Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope " - CROSSWORKS for glycans:  a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  6. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  7. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  8. Setubal named deputy director at Virginia Bioinformatics Institute

    OpenAIRE

    Whyte, Barry James

    2005-01-01

    The Virginia Bioinformatics Institute (VBI) at Virginia Tech has announced the appointment of Joao Setubal as the institute's deputy director. As deputy director, Setubal will act on behalf of VBI's executive and scientific director, Bruno Sobral, handling internal administrative functions, as well as scientific decision making.

  9. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of human genome project is near the end, the work has begun for discovering novel genes from genome sequences and annotating their biological functions. Here are reviewed current major bioinformatics tools and technologies available for large scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  10. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    Science.gov (United States)

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-01

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries. PMID:26510693

  11. Managing, Analysing, and Integrating Big Data in Medical Bioinformatics: Open Problems and Future Perspectives

    OpenAIRE

    2014-01-01

    The explosion of the data both in the biomedical research and in the healthcare systems demands urgent solutions. In particular, the research in omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare is additionally always asking for a tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions...

  12. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  13. Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges

    Science.gov (United States)

    Bruder, Katherine; Malki, Kema; Cooper, Alexandria; Sible, Emily; Shapiro, Jason W.; Watkins, Siobhan C.; Putonti, Catherine

    2016-01-01

    Advances in bioinformatics and sequencing technologies have allowed for the analysis of complex microbial communities at an unprecedented rate. While much focus is often placed on the cellular members of these communities, viruses play a pivotal role, particularly bacteria-infecting viruses (bacteriophages); phages mediate global biogeochemical processes and drive microbial evolution through bacterial grazing and horizontal gene transfer. Despite their importance and ubiquity in nature, very little is known about the diversity and structure of viral communities. Though the need for culture-based methods for viral identification has been somewhat circumvented through metagenomic techniques, the analysis of metaviromic data is marred with many unique issues. In this review, we examine the current bioinformatic approaches for metavirome analyses and the inherent challenges facing the field as illustrated by the ongoing efforts in the exploration of freshwater phage populations. PMID:27375355

  14. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  15. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  16. Design and Implementation of an Interdepartmental Bioinformatics Program across Life Science Curricula

    Science.gov (United States)

    Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.

    2007-01-01

    Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…

  17. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  18. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  19. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  20. Wnt-signalling pathways and microRNAs network in carcinogenesis: experimental and bioinformatics approaches.

    Science.gov (United States)

    Onyido, Emenike K; Sweeney, Eloise; Nateri, Abdolrahman Shams

    2016-01-01

    Over the past few years, microRNAs (miRNAs) have not only emerged as integral regulators of gene expression at the post-transcriptional level but also respond to signalling molecules to affect cell function(s). miRNAs crosstalk with a variety of the key cellular signalling networks such as Wnt, transforming growth factor-β and Notch, control stem cell activity in maintaining tissue homeostasis, while if dysregulated contributes to the initiation and progression of cancer. Herein, we overview the molecular mechanism(s) underlying the crosstalk between Wnt-signalling components (canonical and non-canonical) and miRNAs, as well as changes in the miRNA/Wnt-signalling components observed in the different forms of cancer. Furthermore, the fundamental understanding of miRNA-mediated regulation of Wnt-signalling pathway and vice versa has been significantly improved by high-throughput genomics and bioinformatics technologies. Whilst, these approaches have identified a number of specific miRNA(s) that function as oncogenes or tumour suppressors, additional analyses will be necessary to fully unravel the links among conserved cellular signalling pathways and miRNAs and their potential associated components in cancer, thereby creating therapeutic avenues against tumours. Hence, we also discuss the current challenges associated with Wnt-signalling/miRNAs complex and the analysis using the biomedical experimental and bioinformatics approaches. PMID:27590724

  1. TOGGLE : toolbox for generic NGS analyses

    OpenAIRE

    Monat, Cécile; Tranchant-Dubreuil, Christine; Kougbeadjo, Ayité; Farcy, Cédric; Ortega-Abboud, Enrique; Amanzougarene, Souhila; Ravel, Sébastien; Agbessi, Mawussé; Orjuela-Bouniol, Julie; Summo, Maryline; Sabot, François

    2015-01-01

    Background The explosion of NGS (Next Generation Sequencing) sequence data requires a huge effort in Bioinformatics methods and analyses. The creation of dedicated, robust and reliable pipelines able to handle dozens of samples from raw FASTQ data to relevant biological data is a time-consuming task in all projects relying on NGS. To address this, we created a generic and modular toolbox for developing such pipelines. Results TOGGLE (TOolbox for Generic nGs anaLysEs) is a suite of tools able ...

  2. TOGGLE: toolbox for generic NGS analyses

    OpenAIRE

    Monat, Cécile; Tranchant-Dubreuil, Christine; Kougbeadjo, Ayité; Farcy , Cédric; Ortega-Abboud, Enrique; Amanzougarene, Souhila; Ravel, Sébastien; Agbessi, Mawusse; Orjuela-Bouniol, Julie; Summo, Marilyne; Sabot, François

    2015-01-01

    Background: The explosion of NGS (Next Generation Sequencing) sequence data requires a huge effort in Bioinformatics methods and analyses. The creation of dedicated, robust and reliable pipelines able to handle dozens of samples from raw FASTQ data to relevant biological data is a time-consuming task in all projects relying on NGS. To address this, we created a generic and modular toolbox for developing such pipelines. Results: TOGGLE (TOolbox for Generic nGs anaLysEs) is a suite of tools abl...

  3. TOGGLE: toolbox for generic NGS analyses

    OpenAIRE

    Monat, Cécile; Tranchant-Dubreuil, Christine; Kougbeadjo, Ayité; Farcy, Cédric; Ortega-Abboud, Enrique; Amanzougarene, Souhila; Ravel, Sébastien; Agbessi, Mawusse; Orjuela-Bouniol , Julie; Summo, Marilyne; Sabot, François

    2015-01-01

    Background The explosion of NGS (Next Generation Sequencing) sequence data requires a huge effort in Bioinformatics methods and analyses. The creation of dedicated, robust and reliable pipelines able to handle dozens of samples from raw FASTQ data to relevant biological data is a time-consuming task in all projects relying on NGS. To address this, we created a generic and modular toolbox for developing such pipelines. Results TOGGLE (TOolbox for Generic nGs anaLysEs) is a suite of tools able ...

  4. Bioinformatics Data Distribution and Integration via Web Services and XML

    Institute of Scientific and Technical Information of China (English)

    Xiao Li; Yizheng Zhang

    2003-01-01

    It is widely recognized that exchange, distribution, and integration of biological data are the keys to improve bioinformatics and genome biology in post-genomic era. However, the problem of exchanging and integrating biological data is not solved satisfactorily. The eXtensible Markup Language (XML) is rapidly spreading as an emerging standard for structuring documents to exchange and integrate data on the World Wide Web (WWW). Web service is the next generation of WWW and is founded upon the open standards of W3C (World Wide Web Consortium)and IETF (Internet Engineering Task Force). This paper presents XML and Web Services technologies and their use for an appropriate solution to the problem of bioinformatics data exchange and integration.

  5. Bioinformatics Analysis of Zinc Transporter from Baoding Alfalfa

    Institute of Scientific and Technical Information of China (English)

    Haibo WANG; Junyun GUO

    2012-01-01

    [Objective] This study aimed to perform the bioinformatics analysis of Zinc transporter (ZnT) from Baoding Alfalfa. [Method] Based on the amino acid sequence, the physical and chemical properties, hydrophilicity/hydrophobicity, secondary structure of ZnT from Baoding alfalfa were predicted by a series of bioinformatics software. And the transmembrane domains were predicted by using different online tools. [Result] ZnT is a hydrophobic protein containing 408 amino acids with the theoretical pl of 5.94, and it has 7 potential transmembrane hydrophobic regions. In the sec- ondary structure, co-helix (Hh) accounted for 48.04%, extended strand (Ee) for 9.56%, random coil (Cc) for 42.40%, which was accored with the characteristic of transmembrane protein. [Conclusion] mZnT is a member of CDF family, responsible for transporting Zn^2+ out of the cell membrane to reduce the concentration and toxicity of Zn^2+.

  6. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  7. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data. PMID:25863787

  8. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  9. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  10. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.  .

  11. BioJava: an open-source framework for bioinformatics

    OpenAIRE

    Holland, R. C. G.; Down, T. A.; Pocock, M.; Prlić, A.; Huen, D; James, K.; Foisy, S.; Dräger, A.; Yates, A; Heuer, M.; Schreiber, M. J.

    2008-01-01

    Summary: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website...

  12. Fighting against uncertainty: An essential issue in bioinformatics

    OpenAIRE

    Hamada, Michiaki

    2013-01-01

    Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the "uncertainty" of a solution; that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncerta...

  13. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  14. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    OpenAIRE

    Nora Khaldi

    2012-01-01

    ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational b...

  15. Promoting synergistic research and education in genomics and bioinformatics

    OpenAIRE

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia (Michelle); Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personali...

  16. Drug targets for lymphatic filariasis: A bioinformatics approach

    OpenAIRE

    Om Prakash Sharma; Yellamandaya Vadlamudi; Arun Gupta Kota; Vikrant Kumar Sinha; Muthuvel Suresh Kumar

    2013-01-01

    This review article discusses the current scenario of the national and international burden due to lymphatic filariasis (LF) and describes the active elimination programmes for LF and their achievements to eradicate this most debilitating disease from the earth. Since, bioinformatics is a rapidly growing field of biological study, and it has an increasingly significant role in various fields of biology. We have reviewed its leading involvement in the filarial research using different approach...

  17. Developing expertise in bioinformatics for biomedical research in Africa

    OpenAIRE

    Karikari, Thomas K.; Emmanuel Quansah; Wael M.Y. Mohamed

    2015-01-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are help...

  18. The European Bioinformatics Institute’s data resources

    OpenAIRE

    Brooksbank, Catherine; Cameron, Graham; Thornton, Janet

    2009-01-01

    The wide uptake of next-generation sequencing and other ultra-high throughput technologies by life scientists with a diverse range of interests, spanning fundamental biological research, medicine, agriculture and environmental science, has led to unprecedented growth in the amount of data generated. It has also put the need for unrestricted access to biological data at the centre of biology. The European Bioinformatics Institute (EMBL-EBI) is unique in Europe and is one of only two organisati...

  19. A quick guide for building a successful bioinformatics community.

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  20. Promoting synergistic research and education in genomics and bioinformatics

    OpenAIRE

    Arabnia Hamid R; Zhu Mengxia; Yang Mary Qu; Yang Jack Y; Deng Youping

    2008-01-01

    Abstract Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and ...

  1. Applications of Structural Bioinformatics for the Structural Genomics Era

    OpenAIRE

    Novotny, Marian

    2007-01-01

    Structural bioinformatics deals with the analysis, classification and prediction of three-dimensional structures of biomacromolecules. It is becoming increasingly important as the number of structures is growing rapidly. This thesis describes three studies concerned with protein-function prediction and two studies about protein structure validation. New protein structures are often compared to known structures to find out if they have a known fold, which may provide hints about their function...

  2. Bioinformatics glossary based Database of Biological Databases: DBD

    OpenAIRE

    Siva Kiran RR; Setty MVN; Hanumatha Rao G

    2009-01-01

    Database of Biological/Bioinformatics Databases (DBD) is a collection of 1669 databases and online resources collected from NAR Database Summary Papers (http://www.oxfordjournals.org/nar/database/a/) & Internet search engines. The database has been developed based on 437 keywords (Glossary) available in http://falcon.roswellpark.org/labweb/glossary.html. Keywords with their relevant databases are arranged in alphabetic order which enables quick accession of databases by researchers. Dat...

  3. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative. PMID:22987552

  4. Bioinformatics: Current practice and future challenges for life science education.

    Science.gov (United States)

    Hack, Catherine; Kendall, Gary

    2005-03-01

    It is widely predicted that the application of high-throughput technologies to the quantification and identification of biological molecules will cause a paradigm shift in the life sciences. However, if the biosciences are to evolve from a predominantly descriptive discipline to an information science, practitioners will require enhanced skills in mathematics, computing, and statistical analysis. Universities have responded to the widely perceived skills gap primarily by developing masters programs in bioinformatics, resulting in a rapid expansion in the provision of postgraduate bioinformatics education. There is, however, a clear need to improve the quantitative and analytical skills of life science undergraduates. This article reviews the response of academia in the United Kingdom and proposes the learning outcomes that graduates should achieve to cope with the new biology. While the analysis discussed here uses the development of bioinformatics education in the United Kingdom as an illustrative example, it is hoped that the issues raised will resonate with all those involved in curriculum development in the life sciences. PMID:21638550

  5. A comparison of common programming languages used in bioinformatics

    Directory of Open Access Journals (Sweden)

    Gillings Michael R

    2008-02-01

    Full Text Available Abstract Background The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Results Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/ Conclusion This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.

  6. SNPTrack™ : an integrated bioinformatics system for genetic association studies.

    Science.gov (United States)

    Xu, Joshua; Kelly, Reagan; Zhou, Guangxu; Turner, Steven A; Ding, Don; Harris, Stephen C; Hong, Huixiao; Fang, Hong; Tong, Weida

    2012-01-01

    A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm. PMID:23245293

  7. An Integrative Study on Bioinformatics Computing Concepts, Issues and Problems

    Directory of Open Access Journals (Sweden)

    Muhammad Zakarya

    2011-11-01

    Full Text Available Bioinformatics is the permutation and mishmash of biological science and 4IT. The discipline covers every computational tools and techniques used to administer, examine and manipulate huge sets of biological statistics. The discipline also helps in creation of databases to store up and supervise biological statistics, improvement of computer algorithms to find out relations in these databases and use of computer tools for the study and understanding of biological information, including DNA, RNA, protein sequences, gene expression profiles, protein structures, and biochemical pathways. The study of this paper implements an integrative solution. As we know that solution to a problem in a specific discipline may be a solution to another problem in a different discipline. For example entropy that has been rented from physical sciences is solution to most of the problems and issues in computer science. Another example is bioinformatics, where computing method and applications are implemented over biological information. This paper shows an initiative step towards that and will discuss upon the needs for integration of multiple discipline and sciences. Similarly green chemistry gives birth to a new kind of computing i.e. green computing. In next versions of this paper we will study biological fuel cell and will discuss to develop a mobile battery that will be life time charged using the concepts of biological fuel cell. Another issue that we are going to discuss in our series is brain tumor detection. This paper is a review on BI i.e. bioinformatics to start with.

  8. The Patentability of Biomolecules – Does Online Bioinformatics Compromise Novelty?

    Directory of Open Access Journals (Sweden)

    Anton Hutter

    2006-04-01

    Full Text Available Researchers are becoming increasingly concerned that the confidentiality of their novel biomolecule sequences is being jeopardised, particularly when these sequences are either submitted to sequence databases or uploaded as query terms onto internet-based bioinformatic software suites. The researcher’s fears stem from the fact that the actual uploading of their sequences acts as a novelty destroying prior disclosure or publication, and that this may subsequently preclude valid patent protection for the sequences. This article addresses the key issues involved in the analyses of biomolecules, highlighting potential risks taken by many researchers in regard to patent protection and suggests possible ways in which these risks may be mitigated.

  9. A novel bioinformatic strategy to characterise microbial communities in biogas reactors

    DEFF Research Database (Denmark)

    Treu, Laura; Campanaro, Stefano; De Francisci, Davide;

    2014-01-01

    16S hypervariable regions, especially when working with the not high quality very short reads characteristics of next generation sequencers (Mande S.S. et al., 2012). Previous works analysed the microbial community composition in biogas reactors via 16S rDNA sequencing (Luo, G. et al., 2013; Werner......, J.J. et al., 2011). For this reason we developed a bioinformatics strategy in order to create a tool to review the generated dataset and to obtain a more strict control on the bacterial composition at the species level, with estimation of its reliability. The program perform local similarity search......%). Bacteroides, Acetobacterium and Pseudomonas genera had difficulties in the assignment, gathering medium-low probability as well (1 to 50%). On the contrary several other genera were assigned with high probability and no multiple matches. Some of them including only one species uniquely identified, some other...

  10. 10th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Mayo, Francisco; Paz, Juan

    2016-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  11. 8th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Santana, Juan

    2014-01-01

    Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced and their integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Clearly, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we have seen the surge of a new generation of interdisciplinary scientists that have a strong background in the biological and computational sciences. In this context, the interaction of researche...

  12. Analysing EWviews

    DEFF Research Database (Denmark)

    Jelsøe, Erling; Jæger, Birgit

    2015-01-01

    When analysing the results of a European wide citizen consultation on sustainable consumption it is necessary to take a number of issues into account, such as the question of representativity and tensions between national and European identies and between consumer and Citizen orientations regarding...

  13. Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses:. Keys to Rice Transcriptome Analysis

    Science.gov (United States)

    Kikuchi, Shoshi

    2009-02-01

    Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.

  14. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    Full Text Available Abstract The revolutionary growth in the computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes including a cleaner draft of human genome have been sequenced raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy to administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii cell visualization and simulation to study and model cell behavior, and (iv application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1 analysis based upon the available experimental wet-lab data, (2 the use of mathematical modeling to derive new information, and (3 an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene

  15. Composable languages for bioinformatics: the NYoSh experiment

    Directory of Open Access Journals (Sweden)

    Manuele Simi

    2014-01-01

    Full Text Available Language WorkBenches (LWBs are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell. NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment

  16. Composable languages for bioinformatics: the NYoSh experiment.

    Science.gov (United States)

    Simi, Manuele; Campagne, Fabien

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  17. Applications of Support Vector Machines In Chemo And Bioinformatics

    Science.gov (United States)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.

  18. SPOT--towards temporal data mining in medicine and bioinformatics.

    Science.gov (United States)

    Tusch, Guenter; Bretl, Chris; O'Connor, Martin; Connor, Martin; Das, Amar

    2008-01-01

    Mining large clinical and bioinformatics databases often includes exploration of temporal data. E.g., in liver transplantation, researchers might look for patients with an unusual time pattern of potential complications of the liver. In Knowledge-based Temporal Abstraction time-stamped data points are transformed into an interval-based representation. We extended this framework by creating an open-source platform, SPOT. It supports the R statistical package and knowledge representation standards (OWL, SWRL) using the open source Semantic Web tool Protégé-OWL. PMID:18999225

  19. Application of bioinformatics on the detection of pathogens by Pcr

    International Nuclear Information System (INIS)

    Salmonellas are the main responsible agent for the frequent food-borne gastrointestinal diseases. Their detection using classical methods are laborious and their results take a lot of time to be revealed. In this context, we tried to set up a revealing technique of the invA virulence gene, found in the majority of Salmonella species. After amplification with PCR using specific primers created and verified by bioinformatics programs, two couples of primers were set up and they appeared to be very specific and sensitive for the detection of invA gene. (Author)

  20. WU-Blast2 server at the European Bioinformatics Institute

    OpenAIRE

    Lopez, Rodrigo; Silventoinen, Ville; Robinson, Stephen; Kibria, Asif; Gish, Warren

    2003-01-01

    Since 1995, the WU-BLAST programs (http://blast.wustl.edu) have provided a fast, flexible and reliable method for similarity searching of biological sequence databases. The software is in use at many locales and web sites. The European Bioinformatics Institute's WU-Blast2 (http://www.ebi.ac.uk/blast2/) server has been providing free access to these search services since 1997 and today supports many features that both enhance the usability and expand on the scope of the software.

  1. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    Science.gov (United States)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics based algorithm, for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance---and determination---of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely

  2. Bioinformatics pipeline for functional identification and characterization of proteins

    Science.gov (United States)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; PlÄ der, Wojciech; Przybecki, Zbigniew

    2015-09-01

    The new sequencing methods, called Next Generation Sequencing gives an opportunity to possess a vast amount of data in short time. This data requires structural and functional annotation. Functional identification and characterization of predicted proteins could be done by in silico approches, thanks to a numerous computational tools available nowadays. However, there is a need to confirm the results of proteins function prediction using different programs and comparing the results or confirm experimentally. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  3. MicroRNA from tuberculosis RNA: A bioinformatics study

    OpenAIRE

    Wiwanitkit, Somsri; Wiwanitkit, Viroj

    2012-01-01

    The role of microRNA in the pathogenesis of pulmonary tuberculosis is the interesting topic in chest medicine at present. Recently, it was proposed that the microRNA can be a useful biomarker for monitoring of pulmonary tuberculosis and might be the important part in pathogenesis of disease. Here, the authors perform a bioinformatics study to assess the microRNA within known tuberculosis RNA. The microRNA part can be detected and this can be important key information in further study of the p...

  4. Provenance of e-Science Experiments - experience from Bioinformatics

    OpenAIRE

    Greenwood, M.; Goble, C.A.; Stevens, R. D.; Zhao, J.(Central China Normal University (HZNU), Wuhan, 430079, China); Addis, M; Marvin, D; Moreau, L; Oinn, T.

    2003-01-01

    Like experiments performed at a laboratory bench, the data associated with an e-Science experiment are of reduced value if other scientists are not able to identify the origin, or provenance, of those data. Provenance information is essential if experiments are to be validated and verified by others, or even by those who originally performed them. In this article, we give an overview of our initial work on the provenance of bioinformatics e-Science experiments within myGrid. We use two kinds ...

  5. Multi-Institutional FASTQ File Exchange as a Means of Proficiency Testing for Next-Generation Sequencing Bioinformatics and Variant Interpretation.

    Science.gov (United States)

    Davies, Kurtis D; Farooqi, Midhat S; Gruidl, Mike; Hill, Charles E; Woolworth-Hirschhorn, Julie; Jones, Heather; Jones, Kenneth L; Magliocco, Anthony; Mitui, Midori; O'Neill, Philip H; O'Rourke, Rebecca; Patel, Nirali M; Qin, Dahui; Ramos, Erica; Rossi, Michael R; Schneider, Thomas M; Smith, Geoffrey H; Zhang, Linsheng; Park, Jason Y; Aisner, Dara L

    2016-07-01

    Next-generation sequencing is becoming increasingly common in clinical laboratories worldwide and is revolutionizing clinical molecular testing. However, the large amounts of raw data produced by next-generation sequencing assays and the need for complex bioinformatics analyses present unique challenges. Proficiency testing in clinical laboratories has traditionally been designed to evaluate assays in their entirety; however, it can be alternatively applied to separate assay components. We developed and implemented a multi-institutional proficiency testing approach to directly assess custom bioinformatics and variant interpretation processes. Six clinical laboratories, all of which use the same commercial library preparation kit for next-generation sequencing analysis of tumor specimens, each submitted raw data (FASTQ files) from four samples. These 24 file sets were then deidentified and redistributed to five of the institutions for analysis and interpretation according to their clinically validated approach. Among the laboratories, there was a high rate of concordance in the calling of single-nucleotide variants, in particular those we considered clinically significant (100% concordance). However, there was significant discordance in the calling of clinically significant insertions/deletions, with only two of seven being called by all participating laboratories. Missed calls were addressed by each laboratory to improve their bioinformatics processes. Thus, through our alternative proficiency testing approach, we identified the bioinformatic detection of insertions/deletions as an area of particular concern for clinical laboratories performing next-generation sequencing testing. PMID:27155050

  6. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    OpenAIRE

    Alva, V.; Nam, S.; Söding, J.; Lupas, A.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been av...

  7. Preliminary Study of Bioinformatics Patents and Their Classifications Registered in the KIPRIS Database

    OpenAIRE

    Park, Hyun-Seok

    2012-01-01

    Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single...

  8. BOD: a customizable bioinformatics on demand system accommodating multiple steps and parallel tasks

    OpenAIRE

    Qiao, Li-An; Zhu, Jing; Liu, Qingyan; Zhu, Tao; Song, Chi; Lin, Wei; Wei, Guozhu; Mu, Lisen; Tao, Jiang; Zhao, Nanming; Yang, Guangwen; Liu, Xiangjun

    2004-01-01

    The integration of bioinformatics resources worldwide is one of the major concerns of the biological community. We herein established the BOD (Bioinformatics on demand) system to use Grid computing technology to set up a virtual workbench via a web-based platform, to assist researchers performing customized comprehensive bioinformatics work. Users will be able to submit entire search queries and computation requests, e.g. from DNA assembly to gene prediction and finally protein folding, from ...

  9. How to Establish a Bioinformatics Postgraduate Degree Programme: A Case Study from South Africa

    OpenAIRE

    Machanick, Philip; Bishop, Ozlem Tastan

    2014-01-01

    The Research Unit in Bioinformatics at Rhodes University (RUBi), South Africa offers a Masters of Science in Bioinformatics. Growing demand for Bioinformatics qualifications results in applications from across Africa. Courses aim to bridge gaps in the diverse backgrounds of students who range from biologists with no prior computing exposure to computer scientists with no biology background. The programme is evenly split between coursework and research, with diverse modules from a range of dep...

  10. Emerging bioinformatics approaches for analysis of NGS-derived coding and non-coding RNAs in neurodegenerative diseases

    Directory of Open Access Journals (Sweden)

    Alessandro Guffanti

    2014-03-01

    Full Text Available Neurodegenerative diseases such as late-onset Alzheimer’s disease (LOAD involve a genetically complex and still obscure ensemble of causative and risk factors accompanied by complex feedback responses. The advent of ‘high-throughput’ transcriptome investigation technologies such as deep sequecing is increasingly being combined with statistical and bioinformatics analysis methods complemented by Bayesian Networks or network and graph analyses. Together, such 'integrative' studies are beginning to identify co-regulated gene networks linked with biological pathways and potentially modulating disease predisposition, outcome and progression. Specifically, bioinformatics analyses of integrated microarray and genotyping data in cases and controls reveal changes in gene expression of both protein-coding and small and long regulatory RNAs; highlight relevant quantitative transcriptional differences between LOAD and non-demented control brains and demonstrate reconfiguration of functionally meaningful molecular interaction structures in LOAD. These may be measured as changes in connectivity in ‘hub nodes’ of relevant gene networks. We illustrate here the open analytical questions in the transcriptome investigation of neurodegenerative disease studies, proposing 'ad-hoc' strategies for the evaluation of differential gene expression and hints for a simple analysis of the ncRNA part of such datasets. We then survey the emerging role of long non-coding RNA in the healthy and diseased brain transcriptome and describe the main current methods for computational modeling of gene networks. We propose accessible modular and pathway-oriented methods and guidelines for bioinformatics investigations of whole transcriptome NGS datasets. We finally present methods and databases for functional interpretations of long ncRNAs and propose a simple heuristic approach to visualize and represent physical and and functional interactions of the coding and non

  11. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    Science.gov (United States)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinfonnatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degrees case-studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  12. Promoting inter/multidisciplinary education and research in bioinformatics, systems biology and intelligent computing.

    Science.gov (United States)

    Yang, Mary Qu; Niemierko, Andrzej; Yang, Jack Y; Jin, Yufang

    2009-01-01

    Bioinformatics and systems biology are two booming research areas studying live organisms. Though having different focuses, bioinformatics and systems biology often share same or similar engineering and computer science methods to elucidate the mechanisms of multi-level biological systems. Regulatory mechanisms underlying biological processes involve interactions at cellular, sub-cellular, genomic and genetic levels. Accordingly, we need to bridge the gaps of biomedical researches at different levels of studies and foster the interdisciplinary and multidisciplinary research between both bioinformatics and systems biology domains. The synergic research on integrating bioinformatics and systems biology facilitates the advances in biology and medicine. PMID:20090160

  13. Detecting evolution of bioinformatics with a content and co-authorship analysis.

    Science.gov (United States)

    Song, Min; Yang, Christopher C; Tang, Xuning

    2013-12-01

    Bioinformatics is an interdisciplinary research field that applies advanced computational techniques to biological data. Bibliometrics analysis has recently been adopted to understand the knowledge structure of a research field by citation pattern. In this paper, we explore the knowledge structure of Bioinformatics from the perspective of a core open access Bioinformatics journal, BMC Bioinformatics with trend analysis, the content and co-authorship network similarity, and principal component analysis. Publications in four core journals including Bioinformatics - Oxford Journal and four conferences in Bioinformatics were harvested from DBLP. After converting publications into TF-IDF term vectors, we calculate the content similarity, and we also calculate the social network similarity based on the co-authorship network by utilizing the overlap measure between two co-authorship networks. Key terms is extracted and analyzed with PCA, visualization of the co-authorship network is conducted. The experimental results show that Bioinformatics is fast-growing, dynamic and diversified. The content analysis shows that there is an increasing overlap among Bioinformatics journals in terms of topics and more research groups participate in researching Bioinformatics according to the co-authorship network similarity. PMID:23710427

  14. Web services at the European Bioinformatics Institute-2009.

    Science.gov (United States)

    McWilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-07-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and are ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; require no prior knowledge of the technology and no more than basic experience in programming. In the following we wish to inform of new and updated services as well as briefly describe planned developments to be made available during the course of 2009-2010. PMID:19435877

  15. Bioinformatics analysis of metastasis-related proteins in hepatocellular carcinoma

    Institute of Scientific and Technical Information of China (English)

    Pei-Ming Song; Yang Zhang; Yu-Fei He; Hui-Min Bao; Jian-Hua Luo; Yin-Kun Liu; Peng-Yuan Yang; Xian Chen

    2008-01-01

    AIM: To analyze the metastasis-related proteins in hepatocellular carcinoma (HCC) and discover the biomark-er candidates for diagnosis and therapeutic intervention of HCC metastasis with bioinformatics tools.METHODS: Metastasis-related proteins were determined by stable isotope labeling and MS analysis and analyzed with bioinformatics resources, including Phobius, Kyoto encyclopedia of genes and genomes (KEGG), online mendelian inheritance in man (OHIH) and human protein reference database (HPRD).RESULTS: All the metastasis-related proteins were linked to 83 pathways in KEGG, including MAPK and p53 signal pathways. Protein-protein interaction network showed that all the metastasis-related proteins were categorized into 19 function groups, including cell cycle, apoptosis and signal transcluction. OMIM analysis linked these proteins to 186 OMIM entries.CONCLUSION: Metastasis-related proteins provide HCC cells with biological advantages in cell proliferation, migration and angiogenesis, and facilitate metastasis of HCC cells. The bird's eye view can reveal a global charac-teristic of metastasis-related proteins and many differen-tially expressed proteins can be identified as candidates for diagnosis and treatment of HCC.

  16. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  17. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  18. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  19. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  20. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biolog- ical databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not det

  1. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  2. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floraino, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  3. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  4. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  5. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  6. Comparative Proteome Bioinformatics: Identification of Phosphotyrosine Signaling Proteins in the Unicellular Protozoan Ciliate Tetrahymena

    DEFF Research Database (Denmark)

    Gammeltoft, Steen; Christensen, Søren Tvorup; Joachimiak, Marcin;

    2005-01-01

    Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH......Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH...

  7. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  8. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  9. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  10. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  11. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  12. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Science.gov (United States)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  13. The S-Star Trial Bioinformatics Course: An On-line Learning Success

    Science.gov (United States)

    Lim, Yun Ping; Hoog, Jan-Olov; Gardner, Phyllis; Ranganathan, Shoba; Andersson, Siv; Subbiah, Subramanian; Tan, Tin Wee; Hide, Winston; Weiss, Anthony S.

    2003-01-01

    The S-Star Trial Bioinformatics on-line course (www.s-star.org) is a global experiment in bioinformatics distance education. Six universities from five continents have participated in this project. One hundred and fifty students participated in the first trial course of which 96 followed through the entire course and 70 fulfilled the overall…

  14. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  15. Bioinformatics and the Politics of Innovation in the Life Sciences

    Science.gov (United States)

    Zhou, Yinhua; Datta, Saheli; Salter, Charlotte

    2016-01-01

    The governments of China, India, and the United Kingdom are unanimous in their belief that bioinformatics should supply the link between basic life sciences research and its translation into health benefits for the population and the economy. Yet at the same time, as ambitious states vying for position in the future global bioeconomy they differ considerably in the strategies adopted in pursuit of this goal. At the heart of these differences lies the interaction between epistemic change within the scientific community itself and the apparatus of the state. Drawing on desk-based research and thirty-two interviews with scientists and policy makers in the three countries, this article analyzes the politics that shape this interaction. From this analysis emerges an understanding of the variable capacities of different kinds of states and political systems to work with science in harnessing the potential of new epistemic territories in global life sciences innovation.

  16. NMR structure improvement: A structural bioinformatics & visualization approach

    Science.gov (United States)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhances the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  17. Integrative content-driven concepts for bioinformatics ``beyond the cell"

    Indian Academy of Sciences (India)

    Edgar Wingender; Torsten Crass; Jennifer D Hogan; Alexander E Kel; Olga V Kel-Margoulis; Anatolij P Potapov

    2007-01-01

    Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global (‘omics’) approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern ``systems biology”. The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues/organs.

  18. Technical Perspectives on Knowledge Management in Bioinformatics Workflow Systems

    Directory of Open Access Journals (Sweden)

    Walaa N. Ismail

    2015-01-01

    Full Text Available Workflow systems by it’s nature can help bioin-formaticians to plan for their experiments, store, capture and analysis of the runtime generated data. On the other hand, the life science research usually produces new knowledge at an increasing speed; Knowledge such as papers, databases and other systems knowledge that a researcher needs to deal with is actually a complex task that needs much of efforts and time. Thus the management of knowledge is therefore an important issue for life scientists. Approaches has been developed to organize biological knowledge sources and to record provenance knowledge of an experiment into a readily resource are presently being carried out. This article focuses on the knowledge management of in silico experimentation in bioinformatics workflow systems.

  19. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with the query length of 511 amino acid). © 2010 IEEE.

  20. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    Science.gov (United States)

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in the steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all complete sequenced bacterial genomes followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functional related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to posses a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were that of Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryachaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbons-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  1. Putative lipoproteins identified by bioinformatic genome analysis of Leifsonia xyli ssp. xyli, the causative agent of sugarcane ratoon stunting disease.

    Science.gov (United States)

    Sutcliffe, Iain C; Hutchings, Matthew I

    2007-01-01

    SUMMARY Leifsonia xyli ssp. xyli is the causative agent of ratoon stunting disease, a major cause of economic loss in sugarcane crops. Understanding of the biology of this pathogen has been hampered by its fastidious growth characteristics in vitro. However, the recent release of a genome sequence for this organism has allowed significant novel insights. Further to this, we have performed a bioinformatic analysis of the lipoproteins encoded in the L. xyli genome. These analyses suggest that lipoproteins represent c. 2.0% of the L. xyli predicted proteome. Functional analyses suggest that lipoproteins make an important contribution to the physiology of the pathogen and may influence its ability to cause disease in planta. PMID:20507484

  2. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools...... that facilitate the integration of biomedical ontologies and bioinformatics tools in computational workflows....

  3. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. PMID:25183247

  4. A web services choreography scenario for interoperating bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Cheung David W

    2004-03-01

    Full Text Available Abstract Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1 the platforms on which the applications run are heterogeneous, 2 their web interface is not machine-friendly, 3 they use a non-standard format for data input and output, 4 they do not exploit standards to define application interface and message exchange, and 5 existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates

  5. Cloning, expression and bioinformatics analysis of ATP sulfurylase from Acidithiobacillus ferrooxidans ATCC 23270 in Escherichia coli.

    Science.gov (United States)

    Jaramillo, Michael L; Abanto, Michel; Quispe, Ruth L; Calderón, Julio; Del Valle, Luís J; Talledo, Miguel; Ramírez, Pablo

    2012-01-01

    Molecular studies of enzymes involved in sulfite oxidation in Acidithiobacillus ferrooxidans have not yet been developed, especially in the ATP sulfurylase (ATPS) of these acidophilus tiobacilli that have importance in biomining. This enzyme synthesizes ATP and sulfate from adenosine phosphosulfate (APS) and pyrophosphate (PPi), final stage of the sulfite oxidation by these organisms in order to obtain energy. The atpS gene (1674 bp) encoding the ATPS from Acidithiobacillus ferrooxidans ATCC 23270 was amplified using PCR, cloned in the pET101-TOPO plasmid, sequenced and expressed in Escherichia coli obtaining a 63.5 kDa ATPS recombinant protein according to SDS-PAGE analysis. The bioinformatics and phylogenetic analyses determined that the ATPS from A. ferrooxidans presents ATP sulfurylase (ATS) and APS kinase (ASK) domains similar to ATPS of Aquifex aeolicus, probably of a more ancestral origin. Enzyme activity towards ATP formation was determined by quantification of ATP formed from E. coli cell extracts, using a bioluminescence assay based on light emission by the luciferase enzyme. Our results demonstrate that the recombinant ATP sulfurylase from A. ferrooxidans presents an enzymatic activity for the formation of ATP and sulfate, and possibly is a bifunctional enzyme due to its high homology to the ASK domain from A. aeolicus and true kinases. PMID:23055613

  6. Identification of novel genes and pathways in carotid atheroma using integrated bioinformatic methods.

    Science.gov (United States)

    Nai, Wenqing; Threapleton, Diane; Lu, Jingbo; Zhang, Kewei; Wu, Hongyuan; Fu, You; Wang, Yuanyuan; Ou, Zejin; Shan, Lanlan; Ding, Yan; Yu, Yanlin; Dai, Meng

    2016-01-01

    Atherosclerosis is the primary cause of cardiovascular events and its molecular mechanism urgently needs to be clarified. In our study, atheromatous plaques (ATH) and macroscopically intact tissue (MIT) sampled from 32 patients were compared and an integrated series of bioinformatic microarray analyses were used to identify altered genes and pathways. Our work showed 816 genes were differentially expressed between ATH and MIT, including 443 that were up-regulated and 373 that were down-regulated in ATH tissues. GO functional-enrichment analysis for differentially expressed genes (DEGs) indicated that genes related to the "immune response" and "muscle contraction" were altered in ATHs. KEGG pathway-enrichment analysis showed that up-regulated DEGs were significantly enriched in the "FcεRI-mediated signaling pathway", while down-regulated genes were significantly enriched in the "transforming growth factor-β signaling pathway". Protein-protein interaction network and module analysis demonstrated that VAV1, SYK, LYN and PTPN6 may play critical roles in the network. Additionally, similar observations were seen in a validation study where SYK, LYN and PTPN6 were markedly elevated in ATH. All in all, identification of these genes and pathways not only provides new insights into the pathogenesis of atherosclerosis, but may also aid in the development of prognostic and therapeutic biomarkers for advanced atheroma. PMID:26742467

  7. Cancer bioinformatics: detection of chromatin states,SNP-containing motifs, and functional enrichment modules

    Institute of Scientific and Technical Information of China (English)

    Xiaobo Zhou

    2013-01-01

    In this editorial preface,I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field:detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer.We expect this issue to provide bioinformatics scientists,cancer biologists,and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  8. Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support.

    Science.gov (United States)

    Stockinger, Heinz; Altenhoff, Adrian M; Arnold, Konstantin; Bairoch, Amos; Bastian, Frederic; Bergmann, Sven; Bougueleret, Lydie; Bucher, Philipp; Delorenzi, Mauro; Lane, Lydie; Le Mercier, Philippe; Lisacek, Frédérique; Michielin, Olivier; Palagi, Patricia M; Rougemont, Jacques; Schwede, Torsten; von Mering, Christian; van Nimwegen, Erik; Walther, Daniel; Xenarios, Ioannis; Zavolan, Mihaela; Zdobnov, Evgeny M; Zoete, Vincent; Appel, Ron D

    2014-07-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years. PMID:24792157

  9. Bioinformatics, the Clearing-House Mechanism and the Convention on Biological Diversity

    Directory of Open Access Journals (Sweden)

    Marcos R. Silva

    2004-01-01

    Full Text Available This paper will discuss the development and role of the Convention’s Clearing-House Mechanism (CHM, its synergies with the Biosafety Clearing-House (BCH, and its support for bioinformatic initiatives in light of the Convention’s programme areas and cross-cutting issues. Within this context, the paper will also examine the significance of bioinformatics in assisting Parties to implement obligations under the Convention and the role of the CHM in facilitating activities by Parties and Governments to exploit the benefits arising from the evolving bioinformatics global infrastructure.

  10. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Directory of Open Access Journals (Sweden)

    Xiaobo Zhou

    2013-04-01

    Full Text Available In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  11. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1 and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA, two integrin beta (ITGB, and one syndecan (SDC. Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  12. Exploiting graphics processing units for computational biology and bioinformatics.

    Science.gov (United States)

    Payne, Joshua L; Sinnott-Armstrong, Nicholas A; Moore, Jason H

    2010-09-01

    Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of generalpurpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700. PMID:20658333

  13. Functional proteomics with new mass spectrometric and bioinformatics tools

    International Nuclear Information System (INIS)

    A comprehensive range of mass spectrometric tools is required to investigate todays life science applications and a strong focus is on addressing the needs of functional proteomics. Application examples are given showing the streamlined process of protein identification from low femtomole amounts of digests. Sample preparation is achieved with a convertible robot for automated 2D gel picking, and MALDI target dispensing. MALDI-TOF or ESI-MS subsequent to enzymatic digestion. A choice of mass spectrometers including Q-q-TOF with multipass capability, MALDI-MS/MS with unsegmented PSD, Ion Trap and FT-MS are discussed for their respective strengths and applications. Bioinformatics software that allows both database work and novel peptide mass spectra interpretation is reviewed. The automated database searching uses either entire digest LC-MSn ESI Ion Trap data or MALDI MS and MS/MS spectra. It is shown how post translational modifications are interactively uncovered and de-novo sequencing of peptides is facilitated

  14. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  15. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Science.gov (United States)

    Xu, Zhuofei; Hansen, Martin Asser; Hansen, Lars H; Jacquiod, Samuel; Sørensen, Søren J

    2014-01-01

    As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment) and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function. PMID:24691166

  16. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    Directory of Open Access Journals (Sweden)

    Luoping Zhang

    2012-07-01

    Full Text Available We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other.

  17. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    Directory of Open Access Journals (Sweden)

    Agren Rasmus

    2011-01-01

    Full Text Available Abstract Background Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results MEMOSys (MEtabolic MOdel research and development System is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys.

  18. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing‐data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  19. Phylogenetic diversity (PD and biodiversity conservation: some bioinformatics challenges

    Directory of Open Access Journals (Sweden)

    Daniel P. Faith

    2006-01-01

    Full Text Available Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a. While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bio-informatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data.

  20. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  1. Bioinformatics in high school biology curricula: a study of state science standards.

    Science.gov (United States)

    Wefer, Stephen H; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students. PMID:18316818

  2. Predicted protein-protein interactions in the moss Physcomitrella patens: a new bioinformatic resource

    OpenAIRE

    Schuette, Scott; Piatkowski, Brian; Corley, Aaron; Lang, Daniel; Geisler, Matt

    2015-01-01

    Background Physcomitrella patens, a haploid dominant plant, is fast becoming a useful molecular genetics and bioinformatics tool due to its key phylogenetic position as a bryophyte in the post-genomic era. Genome sequences from select reference species were compared bioinformatically to Physcomitrella patens using reciprocal blasts with the InParanoid software package. A reference protein interaction database assembled using MySQL by compiling BioGrid, BIND, DIP, and Intact databases was quer...

  3. The Bioinformatics Links Directory: a Compilation of Molecular Biology Web Servers

    OpenAIRE

    Fox, Joanne A.; Butland, Stefanie L.; McMillan, Scott; Campbell, Graeme; Ouellette, B. F. Francis

    2005-01-01

    The Bioinformatics Links Directory is an online community resource that contains a directory of freely available tools, databases, and resources for bioinformatics and molecular biology research. The listing of the servers published in this and previous issues of Nucleic Acids Research together with other useful tools and websites represents a rich repository of resources that are openly provided to the research community using internet technologies. The 166 servers highlighted in the 2005 We...

  4. Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    OpenAIRE

    Boulesteix, Anne-Laure; Janitza, Silke; Kruppa, Jochen; König, Inke R.

    2012-01-01

    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Specia...

  5. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    OpenAIRE

    Kumar Ajay; Singh Neetu; Gaurav S.S

    2012-01-01

    Background: The role of Bioinformatics in this modern age of technology advancement can not be over-emphasized. Aim: This study reviews the principle, techniques, and applications of Forensic Bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: The concepts of sequence annotation and whole genome sequencing were possible due to the assimilation of software based tools which are exclusively responsible for the segregation of bulk genomic d...

  6. A New Technique to Manage Big Bioinformatics Data Using Genetic Algorithms

    Directory of Open Access Journals (Sweden)

    Huda Jalil Dikhil

    2016-10-01

    Full Text Available The continuous growth of data, mainly the medical data at laboratories becomes very complex to use and to manage by using traditional ways. So, the researchers start studying genetic information field which increased in the past thirty years in bioinformatics domain (the computer science field, genetic biology field, and DNA. This growth of data becomes known as big bioinformatics data. Thus, efficient algorithms such as Genetic Algorithms are needed to deal with this big and vast amount of bioinformatics data in genetic laboratories. So the researchers proposed two models to manage the big bioinformatics data in addition to the traditional model. The first model by applying Genetic Algorithms before MapReduce, the second model by applying Genetic Algorithms after the MapReduce, and the original or the traditional model by applying only MapReduce without using Genetic Algorithms. The three models were implemented and evaluated using big bioinformatics data collected from the Duchenne Muscular Dystrophy (DMD disorder. The researchers conclude that the second model is the best one among the three models in reducing the size of the data, in execution time, and in addition to the ability to manage and summarize big bioinformatics data. Finally by comparing the percentage errors of the second model with the first model and the traditional model, the researchers obtained the following results 1.136%, 10.227%, and 11.363% respectively. So the second model is the most accurate model with the less percentage error.

  7. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Directory of Open Access Journals (Sweden)

    Xiao Liu

    Full Text Available Extensins (EXTs are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser followed by three to five prolines (Pro residues, which are hydroxylated as hydroxyproline (Hyp and glycosylated. Some EXTs have Tyrosine (Tyr-X-Tyr (where X can be any amino acid motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs, proline-rich extensin-like receptor kinases (PERKs, formin-homolog EXTs (FH EXTs, chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1 classical EXTs were likely derived after the terrestrialization of plants; (2 LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3 monocots have few classical EXTs; (4 Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5 green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  8. Comparison of acceleration techniques forselected low-level bioinformatics operations

    Directory of Open Access Journals (Sweden)

    Daniel eLangenkaemper

    2016-02-01

    Full Text Available Within the recent years clock rates of modern processors stagnated while the demand for computing power continued to grow. This applied particularly for the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g. modern genomics.This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding for a certain techniques for the problem at hand.We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC. Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead.We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually

  9. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  10. Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

    Energy Technology Data Exchange (ETDEWEB)

    Ngu, A; Rocco, D; Critchlow, T; Buttler, D

    2003-12-22

    The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources--e.g. BLAST sequence homology search interfaces. However, the process for seeking the desired scientific information is still very tedious and frustrating. While there are several known servers on genomic data (e.g., GeneBank, EMBL, NCBI), that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results are hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change out-pace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source specific wrappers is needed to assist scientists to access to hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics which is inherent in the design of the Web data source. In this paper, we propose an automatic approach to classify Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

  11. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode polyketides important in pathogenicity. PMID:27388157

  12. What can bioinformatics do for Natural History museums?

    Directory of Open Access Journals (Sweden)

    Becerra, José María

    2003-06-01

    Full Text Available We propose the founding of a Natural History bioinformatics framework, which would solve one of the main problems in Natural History: data which is scattered around in many incompatible systems (not only computer systems, but also paper ones. This framework consists of computer resources (hardware and software, methodologies that ease the circulation of data, and staff expert in dealing with computers, who will develop software solutions to the problems encountered by naturalists. This system is organized in three layers: acquisition, data and analysis. Each layer is described, and an account of the elements that constitute it given.

    Se presentan las bases de una estructura bioinformática para Historia Natural, que trata de resolver uno de los principales problemas en ésta: la presencia de datos distribuidos a lo largo de muchos sistemas incompatibles entre sí (y no sólo hablamos de sistemas informáticos, sino también en papel. Esta estructura se sustenta en recursos informáticos (en sus dos vertientes: hardware y software, en metodologías que permitan la fácil circulación de los datos, y personal experto en el uso de ordenadores que se encargue de desarrollar soluciones software a los problemas que plantean los naturalistas. Este sistema estaría organizado en tres capas: de adquisición, de datos y de análisis. Cada una de estas capas se describe, indicando los elementos que la componen.

  13. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that

  14. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  15. Insight into GATA1 transcriptional activity through interrogation of cis elements disrupted in human erythroid disorders.

    Science.gov (United States)

    Wakabayashi, Aoi; Ulirsch, Jacob C; Ludwig, Leif S; Fiorini, Claudia; Yasuda, Makiko; Choudhuri, Avik; McDonel, Patrick; Zon, Leonard I; Sankaran, Vijay G

    2016-04-19

    Whole-exome sequencing has been incredibly successful in identifying causal genetic variants and has revealed a number of novel genes associated with blood and other diseases. One limitation of this approach is that it overlooks mutations in noncoding regulatory elements. Furthermore, the mechanisms by which mutations in transcriptionalcis-regulatory elements result in disease remain poorly understood. Here we used CRISPR/Cas9 genome editing to interrogate three such elements harboring mutations in human erythroid disorders, which in all cases are predicted to disrupt a canonical binding motif for the hematopoietic transcription factor GATA1. Deletions of as few as two to four nucleotides resulted in a substantial decrease (>80%) in target gene expression. Isolated deletions of the canonical GATA1 binding motif completely abrogated binding of the cofactor TAL1, which binds to a separate motif. Having verified the functionality of these three GATA1 motifs, we demonstrate strong evolutionary conservation of GATA1 motifs in regulatory elements proximal to other genes implicated in erythroid disorders, and show that targeted disruption of such elements results in altered gene expression. By modeling transcription factor binding patterns, we show that multiple transcription factors are associated with erythroid gene expression, and have created predictive maps modeling putative disruptions of their binding sites at key regulatory elements. Our study provides insight into GATA1 transcriptional activity and may prove a useful resource for investigating the pathogenicity of noncoding variants in human erythroid disorders. PMID:27044088

  16. Human Papillomavirus Type 18 cis-Elements Crucial for Segregation and Latency.

    Directory of Open Access Journals (Sweden)

    Mart Ustav

    Full Text Available Stable maintenance replication is characteristic of the latency phase of HPV infection, during which the viral genomes are actively maintained as extrachromosomal genetic elements in infected proliferating basal keratinocytes. Active replication in the S-phase and segregation of the genome into daughter cells in mitosis are required for stable maintenance replication. Most of our knowledge about papillomavirus genome segregation has come from studies of bovine papillomavirus type 1 (BPV-1, which have demonstrated that the E2 protein cooperates with cellular trans-factors and that E2 binding sites act as cis-regulatory elements in the viral genome that are essential for the segregation process. However, the genomic organization of the regulatory region in HPVs, and the properties of the viral proteins are different from those of their BPV-1 counterparts. We have designed a segregation assay for HPV-18 and used it to demonstrate that the E2 protein performs segregation in combination with at least two E2 binding sites. The cooperative binding of the E2 protein to two E2 binding sites is a major determinant of HPV-18 genome segregation, as demonstrated by the change in spacing between adjacent binding sites #1 and #2 in the HPV-18 Upstream Regulatory Region (URR. Duplication or triplication of the natural 4 bp 5'-CGGG-3' spacer between the E2 binding sites increased the cooperative binding of the E2 molecules as well as E2-dependent segregation. Removal of any spacing between these sites eliminated cooperative binding of the E2 protein and disabled segregation of the URR and HPV-18 genome. Transfer of these configurations of the E2 binding sites into viral genomes confirmed the role of the E2 protein and binding sites #1 and #2 in the segregation process. Additional analysis demonstrated that these sites also play an important role in the transcriptional regulation of viral gene expression from different HPV-18 promoters.

  17. Human Papillomavirus Type 18 cis-Elements Crucial for Segregation and Latency.

    Science.gov (United States)

    Ustav, Mart; Castaneda, Fernando Rodriguez; Reinson, Tormi; Männik, Andres; Ustav, Mart

    2015-01-01

    Stable maintenance replication is characteristic of the latency phase of HPV infection, during which the viral genomes are actively maintained as extrachromosomal genetic elements in infected proliferating basal keratinocytes. Active replication in the S-phase and segregation of the genome into daughter cells in mitosis are required for stable maintenance replication. Most of our knowledge about papillomavirus genome segregation has come from studies of bovine papillomavirus type 1 (BPV-1), which have demonstrated that the E2 protein cooperates with cellular trans-factors and that E2 binding sites act as cis-regulatory elements in the viral genome that are essential for the segregation process. However, the genomic organization of the regulatory region in HPVs, and the properties of the viral proteins are different from those of their BPV-1 counterparts. We have designed a segregation assay for HPV-18 and used it to demonstrate that the E2 protein performs segregation in combination with at least two E2 binding sites. The cooperative binding of the E2 protein to two E2 binding sites is a major determinant of HPV-18 genome segregation, as demonstrated by the change in spacing between adjacent binding sites #1 and #2 in the HPV-18 Upstream Regulatory Region (URR). Duplication or triplication of the natural 4 bp 5'-CGGG-3' spacer between the E2 binding sites increased the cooperative binding of the E2 molecules as well as E2-dependent segregation. Removal of any spacing between these sites eliminated cooperative binding of the E2 protein and disabled segregation of the URR and HPV-18 genome. Transfer of these configurations of the E2 binding sites into viral genomes confirmed the role of the E2 protein and binding sites #1 and #2 in the segregation process. Additional analysis demonstrated that these sites also play an important role in the transcriptional regulation of viral gene expression from different HPV-18 promoters. PMID:26288015

  18. Bioinformatics pipelines for targeted resequencing and whole-exome sequencing of human and mouse genomes: a virtual appliance approach for instant deployment.

    Directory of Open Access Journals (Sweden)

    Jason Li

    Full Text Available Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance, making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing, enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/.

  19. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    Science.gov (United States)

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss issues related to design and optimize bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  20. Bioinformatics Analyses of the Role of Vascular Endothelial Growth Factor in Patients with Non-Small Cell Lung Cancer.

    Directory of Open Access Journals (Sweden)

    Ying Wang

    Full Text Available This study was aimed to identify the expression pattern of vascular endothelial growth factor (VEGF in non-small cell lung cancer (NSCLC and to explore its potential correlation with the progression of NSCLC.Gene expression profile GSE39345 was downloaded from the Gene Expression Omnibus database. Twenty healthy controls and 32 NSCLC samples before chemotherapy were analyzed to identify the differentially expressed genes (DEGs. Then pathway enrichment analysis of the DEGs was performed and protein-protein interaction networks were constructed. Particularly, VEGF genes and the VEGF signaling pathway were analyzed. The sub-network was constructed followed by functional enrichment analysis.Total 1666 up-regulated and 1542 down-regulated DEGs were identified. The down-regulated DEGs were mainly enriched in the pathways associated with cancer. VEGFA and VEGFB were found to be the initiating factor of VEGF signaling pathway. In addition, in the epidermal growth factor receptor (EGFR, VEGFA and VEGFB associated sub-network, kinase insert domain receptor (KDR, fibronectin 1 (FN1, transforming growth factor beta induced (TGFBI and proliferating cell nuclear antigen (PCNA were found to interact with at least two of the three hub genes. The DEGs in this sub-network were mainly enriched in Gene Ontology terms related to cell proliferation.EGFR, KDR, FN1, TGFBI and PCNA may interact with VEGFA to play important roles in NSCLC tumorigenesis. These genes and corresponding proteins may have the potential to be used as the targets for either diagnosis or treatment of patients with NSCLC.

  1. Integrated bioinformatics to decipher the ascorbic acid metabolic network in tomato.

    Science.gov (United States)

    Ruggieri, Valentino; Bostan, Hamed; Barone, Amalia; Frusciante, Luigi; Chiusano, Maria Luisa

    2016-07-01

    Ascorbic acid is involved in a plethora of reactions in both plant and animal metabolism. It plays an essential role neutralizing free radicals and acting as enzyme co-factor in several reaction. Since humans are ascorbate auxotrophs, enhancing the nutritional quality of a widely consumed vegetable like tomato is a desirable goal. Although the main reactions of the ascorbate biosynthesis, recycling and translocation pathways have been characterized, the assignment of tomato genes to each enzymatic step of the entire network has never been reported to date. By integrating bioinformatics approaches, omics resources and transcriptome collections today available for tomato, this study provides an overview on the architecture of the ascorbate pathway. In particular, 237 tomato loci were associated with the different enzymatic steps of the network, establishing the first comprehensive reference collection of candidate genes based on the recently released tomato gene annotation. The co-expression analyses performed by using RNA-Seq data supported the functional investigation of main expression patterns for the candidate genes and highlighted a coordinated spatial-temporal regulation of genes of the different pathways across tissues and developmental stages. Taken together these results provide evidence of a complex interplaying mechanism and highlight the pivotal role of functional related genes. The definition of genes contributing to alternative pathways and their expression profiles corroborates previous hypothesis on mechanisms of accumulation of ascorbate in the later stages of fruit ripening. Results and evidences here provided may facilitate the development of novel strategies for biofortification of tomato fruit with Vitamin C and offer an example framework for similar studies concerning other metabolic pathways and species. PMID:27007138

  2. Empowered genome community: leveraging a bioinformatics platform as a citizen-scientist collaboration tool.

    Science.gov (United States)

    Wendelsdorf, Katherine; Shah, Sohela

    2015-09-01

    There is on-going effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals either sick or healthy - to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from of a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype as well as annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this on-line tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen-scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public. PMID:27054071

  3. Integration and bioinformatics analysis of DNA-methylated genes associated with drug resistance in ovarian cancer

    Science.gov (United States)

    YAN, BINGBING; YIN, FUQIANG; WANG, QI; ZHANG, WEI; LI, LI

    2016-01-01

    The main obstacle to the successful treatment of ovarian cancer is the development of drug resistance to combined chemotherapy. Among all the factors associated with drug resistance, DNA methylation apparently plays a critical role. In this study, we performed an integrative analysis of the 26 DNA-methylated genes associated with drug resistance in ovarian cancer, and the genes were further evaluated by comprehensive bioinformatics analysis including gene/protein interaction, biological process enrichment and annotation. The results from the protein interaction analyses revealed that at least 20 of these 26 methylated genes are present in the protein interaction network, indicating that they interact with each other, have a correlation in function, and may participate as a whole in the regulation of ovarian cancer drug resistance. There is a direct interaction between the phosphatase and tensin homolog (PTEN) gene and at least half of the other genes, indicating that PTEN may possess core regulatory functions among these genes. Biological process enrichment and annotation demonstrated that most of these methylated genes were significantly associated with apoptosis, which is possibly an essential way for these genes to be involved in the regulation of multidrug resistance in ovarian cancer. In addition, a comprehensive analysis of clinical factors revealed that the methylation level of genes that are associated with the regulation of drug resistance in ovarian cancer was significantly correlated with the prognosis of ovarian cancer. Overall, this study preliminarily explains the potential correlation between the genes with DNA methylation and drug resistance in ovarian cancer. This finding has significance for our understanding of the regulation of resistant ovarian cancer by methylated genes, the treatment of ovarian cancer, and improvement of the prognosis of ovarian cancer. PMID:27347118

  4. Empowered genome community: leveraging a bioinformatics platform as a citizen–scientist collaboration tool

    Directory of Open Access Journals (Sweden)

    Katherine Wendelsdorf

    2015-09-01

    Full Text Available There is on-going effort in the biomedical research community to leverage Next Generation Sequencing (NGS technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals either sick or healthy – to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from of a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype as well as annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC is a new program in which QIAGEN is making this on-line tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen–scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public.

  5. "On the job" learning: A bioinformatics course incorporating undergraduates in actual research projects and manuscript submissions.

    Science.gov (United States)

    Smith, Jason T; Harris, Justine C; Lopez, Oscar J; Valverde, Laura; Borchert, Glen M

    2015-01-01

    The sequencing of whole genomes and the analysis of genetic information continues to fundamentally change biological and medical research. Unfortunately, the people best suited to interpret this data (biologically trained researchers) are commonly discouraged by their own perceived computational limitations. To address this, we developed a course to help alleviate this constraint. Remarkably, in addition to equipping our undergraduates with an informatic toolset, we found our course design helped prepare our students for collaborative research careers in unexpected ways. Instead of simply offering a traditional lecture- or laboratory-based course, we chose a guided inquiry method, where an instructor-selected research question is examined by students in a collaborative analysis with students contributing to experimental design, data collection, and manuscript reporting. While students learn the skills needed to conduct bioinformatic research throughout all sections of the course, importantly, students also gain experience in working as a team and develop important communication skills through working with their partner and the class as a whole, and by contributing to an original research article. Remarkably, in its first three semesters, this novel computational genetics course has generated 45 undergraduate authorships across three peer-reviewed articles. More importantly, the students that took this course acquired a positive research experience, newfound informatics technical proficiency, unprecedented familiarity with manuscript preparation, and an earned sense of achievement. Although this course deals with analyses of genetic systems, we suggest the basic concept of integrating actual research projects into a 16-week undergraduate course could be applied to numerous other research-active academic fields. PMID:25643604

  6. Rapid cloning and bioinformatic analysis of spinach Y chromosome-specific EST sequences

    Indian Academy of Sciences (India)

    Chuan-Liang Deng; Wei-Li Zhang; Ying Cao; Shao-Jing Wang; Shu-Fen Li; Wu-Jun Gao; Long-Dou Lu

    2015-12-01

    The genome of spinach single chromosome complement is about 1000 Mbp, which is the model material to study the molecular mechanisms of plant sex differentiation. The cytological study showed that the biggest spinach chromosome (chromosome 1) was taken as spinach sex chromosome. It had three alleles of sex-related , m and . Many researchers have been trying to clone the sex-determining genes and investigated the molecular mechanism of spinach sex differentiation. However, there are no successful cloned reports about these genes. A new technology combining chromosome microdissection with hybridization-specific amplification (HSA) was adopted. The spinach Y chromosome degenerate oligonucleotide primed-PCR (DOP-PCR) products were hybridized with cDNA of the male spinach flowers in florescence. The female spinach genome was taken as blocker and cDNA library specifically expressed in Y chromosome was constructed. Moreover, expressed sequence tag (EST) sequences in cDNA library were cloned, sequenced and bioinformatics was analysed. There were 63 valid EST sequences obtained in this study. The fragment size was between 53 and 486 bp. BLASTn homologous alignment indicated that 12 EST sequences had homologous sequences of nucleic acids, the rest were new sequences. BLASTx homologous alignment indicated that 16 EST sequences had homologous protein-encoding nucleic acid sequence. The spinach Y chromosome-specific EST sequences laid the foundation for cloning the functional genes, specifically expressed in spinach Y chromosome. Meanwhile, the establishment of the technology system in the research provided a reference for rapid cloning of other biological sex chromosome-specific EST sequences.

  7. Bioinformatic prediction, deep sequencing of microRNAs and expression analysis during phenotypic plasticity in the pea aphid, Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Leterme Nathalie

    2010-05-01

    Full Text Available Abstract Background Post-transcriptional regulation in eukaryotes can be operated through microRNA (miRNAs mediated gene silencing. MiRNAs are small (18-25 nucleotides non-coding RNAs that play crucial role in regulation of gene expression in eukaryotes. In insects, miRNAs have been shown to be involved in multiple mechanisms such as embryonic development, tissue differentiation, metamorphosis or circadian rhythm. Insect miRNAs have been identified in different species belonging to five orders: Coleoptera, Diptera, Hymenoptera, Lepidoptera and Orthoptera. Results We developed high throughput Solexa sequencing and bioinformatic analyses of the genome of the pea aphid Acyrthosiphon pisum in order to identify the first miRNAs from a hemipteran insect. By combining these methods we identified 149 miRNAs including 55 conserved and 94 new miRNAs. Moreover, we investigated the regulation of these miRNAs in different alternative morphs of the pea aphid by analysing the expression of miRNAs across the switch of reproduction mode. Pea aphid microRNA sequences have been posted to miRBase: http://microrna.sanger.ac.uk/sequences/ Conclusions Our study has identified candidates as putative regulators involved in reproductive polyphenism in aphids and opens new avenues for further functional analyses.

  8. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    Science.gov (United States)

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields. PMID:20419428

  9. New Directions in Statistical Physics: Econophysics, Bioinformatics, and Pattern Recognition

    International Nuclear Information System (INIS)

    This book contains 18 contributions from different authors. Its subtitle 'Econophysics, Bioinformatics, and Pattern Recognition' says more precisely what it is about: not so much about central problems of conventional statistical physics like equilibrium phase transitions and critical phenomena, but about its interdisciplinary applications. After a long period of specialization, physicists have, over the last few decades, found more and more satisfaction in breaking out of the limitations set by the traditional classification of sciences. Indeed, this classification had never been strict, and physicists in particular had always ventured into other fields. Helmholtz, in the middle of the 19th century, had considered himself a physicist when working on physiology, stressing that the physics of animate nature is as much a legitimate field of activity as the physics of inanimate nature. Later, Max Delbrueck and Francis Crick did for experimental biology what Schroedinger did for its theoretical foundation. And many of the experimental techniques used in chemistry, biology, and medicine were developed by a steady stream of talented physicists who left their proper discipline to venture out into the wider world of science. The development we have witnessed over the last thirty years or so is different. It started with neural networks where methods could be applied which had been developed for spin glasses, but todays list includes vehicular traffic (driven lattice gases), geology (self-organized criticality), economy (fractal stochastic processes and large scale simulations), engineering (dynamical chaos), and many others. By staying in the physics departments, these activities have transformed the physics curriculum and the view physicists have of themselves. In many departments there are now courses on econophysics or on biological physics, and some universities offer degrees in the physics of traffic or in econophysics. In order to document this change of attitude

  10. High-throughput bioinformatics with the Cyrille2 pipeline system.

    NARCIS (Netherlands)

    Fiers, M.W.E.J.; Burgt, van der A.; Datema, E.; Groot, de J.C.W.; Ham, van R.C.H.J.

    2008-01-01

    Background - Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses a

  11. Uses and challenges of bioinformatic tools in mass spectrometric-based proteomic brain perturbation studies.

    Science.gov (United States)

    Guingab-Cagmat, Joy D; Cagmat, Emilio B; Kobeissy, Firas H; Anagli, John

    2014-01-01

    Mass spectrometry (MS) has become the method of choice to study the proteome of brain injury. The high throughput nature of MS-based proteomic experiments generates massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. In this review, a brief overview of the currently available bioinformatics strategies applied to neuroproteomic studies is presented. Application of commercially available bioinformatics software to different brain injury studies demonstrates integration of the data mining and analysis applications into neuroproteomic workflows that can identify major protein markers as well as highlight the biological processes and molecular functions involved. PMID:24449691

  12. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.

  13. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    Science.gov (United States)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  14. A Survey of Bioinformatics Database and Software Usage through Mining the Literature

    Science.gov (United States)

    Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L.; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905

  15. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Science.gov (United States)

    Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905

  16. Bioinformatics for the synthetic biology of natural products: integrating across the Design-Build-Test cycle.

    Science.gov (United States)

    Carbonell, Pablo; Currin, Andrew; Jervis, Adrian J; Rattray, Nicholas J W; Swainston, Neil; Yan, Cunyu; Takano, Eriko; Breitling, Rainer

    2016-08-27

    Covering: 2000 to 2016Progress in synthetic biology is enabled by powerful bioinformatics tools allowing the integration of the design, build and test stages of the biological engineering cycle. In this review we illustrate how this integration can be achieved, with a particular focus on natural products discovery and production. Bioinformatics tools for the DESIGN and BUILD stages include tools for the selection, synthesis, assembly and optimization of parts (enzymes and regulatory elements), devices (pathways) and systems (chassis). TEST tools include those for screening, identification and quantification of metabolites for rapid prototyping. The main advantages and limitations of these tools as well as their interoperability capabilities are highlighted. PMID:27185383

  17. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  18. Genomics, proteomics and bioinformatics of human heart failure

    OpenAIRE

    dos Remedios, C. G.; Liew, C C; Allen, P. D.; Winslow, R. L.; Van Eyk, J. E.; Dunn, M J

    2003-01-01

    Unraveling the molecular complexities of human heart failure, particularly end-stage failure, can be achieved by combining multiple investigative approaches. There are several parts to the problem. Each patient is the product of a complex set of genetic variations, different degrees of influence of diets and lifestyles, and usually heart transplantation patients are treated with multiple drugs. The genomic status of the myocardium of any one transplant patient can be analysed using gene array...

  19. A bioinformatics analysis of Lamin-A regulatory network: a perspective on epigenetic involvement in Hutchinson-Gilford progeria syndrome.

    Science.gov (United States)

    Arancio, Walter

    2012-04-01

    Hutchinson-Gilford progeria syndrome (HGPS) is a rare human genetic disease that leads to premature aging. HGPS is caused by mutation in the Lamin-A (LMNA) gene that leads, in affected young individuals, to the accumulation of the progerin protein, usually present only in aging differentiated cells. Bioinformatics analyses of the network of interactions of the LMNA gene and transcripts are presented. The LMNA gene network has been analyzed using the BioGRID database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA ( http://genemania.org/). The network of interaction of LMNA transcripts has been further analyzed following the competing endogenous (ceRNA) hypotheses (RNA cross-talk via microRNAs [miRNAs]) and using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest particular relevance of epigenetic modifiers (via acetylase complexes and specifically HTATIP histone acetylase) and adenosine triphosphate (ATP)-dependent chromatin remodelers (via pBAF, BAF, and SWI/SNF complexes). PMID:22533413

  20. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  1. BioXSD: the common data-exchange format for everyday bioinformatics web services

    DEFF Research Database (Denmark)

    Kalas, M.; Puntervoll, P.; Joseph, A.;

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use...

  2. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Science.gov (United States)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  3. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    Directory of Open Access Journals (Sweden)

    Kumar Ajay

    2012-01-01

    Full Text Available Background: The role of Bioinformatics in this modern age of technology advancement can not be over-emphasized. Aim: This study reviews the principle, techniques, and applications of Forensic Bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: The concepts of sequence annotation and whole genome sequencing were possible due to the assimilation of software based tools which are exclusively responsible for the segregation of bulk genomic data. DNA profiling produces profiles which are the encrypted sets of numbers that reflect a person's DNA makeup, which can also be used as the person's identifier. Implementation of automated analysis system coupled with latest computer based software’s making the results easy to comprehend. Major application of forensic Bioinformatics in the field of forensic science includes quick, bulk and precise review of the DNA evidence with the intent of finding and drawing attention to recurring problems so that the testing continues to better and more reliable. Present day, Genetic Counsellors are also used the derived information of Genomic data for creating pedigree in case of genetic disorders. Conclusion: It is important that with the usefulness of Forensic Bioinformatics, a far greater commitment to openness and transparency and a greater availability of documents to public scrutiny is recommended.

  4. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    Science.gov (United States)

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  5. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    Science.gov (United States)

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  6. Implementation and Assessment of a Molecular Biology and Bioinformatics Undergraduate Degree Program

    Science.gov (United States)

    Pham, Daphne Q. -D.; Higgs, David C.; Statham, Anne; Schleiter, Mary Kay

    2008-01-01

    The Department of Biological Sciences at the University of Wisconsin-Parkside has developed and implemented an innovative, multidisciplinary undergraduate curriculum in Molecular Biology and Bioinformatics (MBB). The objective of the MBB program is to give students a hands-on facility with molecular biology theories and laboratory techniques, an…

  7. Infusing Bioinformatics and Research-Like Experience into a Molecular Biology Laboratory Course

    Science.gov (United States)

    Nogaj, Luiza A.

    2014-01-01

    A nine-week laboratory project designed for a sophomore level molecular biology course is described. Small groups of students (3-4 per group) choose a tumor suppressor gene (TSG) or an oncogene for this project. Each group researches the role of their TSG/oncogene from primary literature articles and uses bioinformatics engines to find the gene…

  8. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene;

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein...

  9. Synergy between Medical Informatics and Bioinformatics: Facilitating Genomic Medicine for Future Health Care

    Czech Academy of Sciences Publication Activity Database

    Martin-Sanchez, F.; Iakovidis, I.; Norager, S.; Maojo, V.; de Groen, P.; Van der Lei, J.; Jones, T.; Abraham-Fuchs, K.; Apweiler, R.; Babic, A.; Baud, R.; Breton, V.; Cinquin, P.; Doupi, P.; Dugas, M.; Eils, R.; Engelbrecht, R.; Ghazal, P.; Jehenson, P.; Kulikowski, C.; Lampe, K.; De Moor, G.; Orphanoudakis, S.; Rossing, N.; Sarachan, B.; Sousa, A.; Spekowius, G.; Thireos, G.; Zahlmann, G.; Zvárová, Jana; Hermosilla, I.; Vicente, F. J.

    2004-01-01

    Roč. 37, - (2004), s. 30-42. ISSN 1532-0464 Institutional research plan: CEZ:AV0Z1030915 Keywords : bioinformatics * medical informatics * genomics * genomic medicine * biomedical informatics Subject RIV: BD - Theory of Information Impact factor: 1.013, year: 2004

  10. Ramping up to the Biology Workbench: A Multi-Stage Approach to Bioinformatics Education

    Science.gov (United States)

    Greene, Kathleen; Donovan, Sam

    2005-01-01

    In the process of designing and field-testing bioinformatics curriculum materials, we have adopted a three-stage, progressive model that emphasizes collaborative scientific inquiry. The elements of the model include: (1) context setting, (2) introduction to concepts, processes, and tools, and (3) development of competent use of technologically…

  11. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Nielsen, Sofie V; Lindorff-Larsen, Kresten;

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of...

  12. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling;

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  13. Incorporating a New Bioinformatics Component into Genetics at a Historically Black College: Outcomes and Lessons

    Science.gov (United States)

    Holtzclaw, J. David; Eisen, Arri; Whitney, Erika M.; Penumetcha, Meera; Hoey, J. Joseph; Kimbro, K. Sean

    2006-01-01

    Many students at minority-serving institutions are underexposed to Internet resources such as the human genome project, PubMed, NCBI databases, and other Web-based technologies because of a lack of financial resources. To change this, we designed and implemented a new bioinformatics component to supplement the undergraduate Genetics course at…

  14. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    Science.gov (United States)

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  15. Bioinformatics and structural characterization of a hypothetical protein from Streptococcus mutans

    DEFF Research Database (Denmark)

    Nan, Jie; Brostromer, Erik; Liu, Xiang-Yu;

    2009-01-01

    -like molecules. From the interlinking structural and bioinformatics studies, we have concluded that SMU.440 could be involved in polyketide-like antibiotic resistance, providing a better understanding of this hypothetical protein. Besides, the combination of multiple methods in this study can be used as a...

  16. Bioinformatics Education in High School: Implications for Promoting Science, Technology, Engineering, and Mathematics Careers

    Science.gov (United States)

    Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The…

  17. Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…

  18. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  19. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.;

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  20. Alu Insertions and Genetic Diversity: A Preliminary Investigation by an Undergraduate Bioinformatics Class

    Science.gov (United States)

    Elwess, Nancy L.; Duprey, Stephen L.; Harney, Lindesay A.; Langman, Jessie E.; Marino, Tara C.; Martinez, Carolina; McKeon, Lauren L.; Moss, Chantel I. E.; Myrie, Sasha S.; Taylor, Luke Ryan

    2008-01-01

    "Alu"-insertion polymorphisms were used by an undergraduate Bioinformatics class to study how these insertion sites could be the basis for an investigation in human population genetics. Based on the students' investigation, both allele and genotype "Alu" frequencies were determined for African-American and Japanese populations as well as a…

  1. Integration of Proteomics, Bioinformatics and Systems biology in Brain Injury Biomarker Discovery

    Directory of Open Access Journals (Sweden)

    JoyGuingab-Cagmat

    2013-05-01

    Full Text Available Traumatic brain injury (TBI is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed.

  2. Integration of proteomics, bioinformatics, and systems biology in traumatic brain injury biomarker discovery.

    Science.gov (United States)

    Guingab-Cagmat, J D; Cagmat, E B; Hayes, R L; Anagli, J

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  3. Harvard Medical School professor to give lecture on bacterial toxins at Virginia Bioinformatics Institute

    OpenAIRE

    Whyte, Barry James

    2009-01-01

    R. John Collier, Maude and Lillian Presley Professor of Microbiology and Molecular Genetics at Harvard Medical School, will visit the Virginia Bioinformatics Institute at Virginia Tech on May 21 and 22 to discuss his research on the function of bacterial toxins, including how this work can be used to develop countermeasures against anthrax.

  4. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    Science.gov (United States)

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  5. Virginia Bioinformatics Institute, collaborators receive $1.45 million to develop petascale computer modeling capabilities

    OpenAIRE

    Whyte, Barry James

    2009-01-01

    The National Science Foundation (NSF) has awarded a four-year, $1.45-million grant to the Network Dynamics and Simulation Science Laboratory at the Virginia Bioinformatics Institute at Virginia Tech and partners to develop petascale computing environments that model billions of individuals in extremely large social and information networks.

  6. Identification of novel highly expressed genes in pancreatic ductal adenocarcinomas through a bioinformatics analysis of expressed sequence tags.

    Science.gov (United States)

    Cao, Dengfeng; Hustinx, Steven R; Sui, Guoping; Bala, P; Sato, Norihiro; Martin, Sean; Maitra, Anirban; Murphy, Kathleen M; Cameron, John L; Yeo, Charles J; Kern, Scott E; Goggins, Michael; Pandey, Akhilesh; Hruban, Ralph H

    2004-11-01

    In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosal [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip(R) set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. The GeneExpress(R) software system Fold Change Analysis Tool was used and 60 ESTs were identified that were expressed at levels at least 3-fold greater in the pancreatic cancers as compared to normal tissues. Searches against the human genomic sequence and comparative genomic analysis of human and mouse genomes was carried out using basic local alignment search tools (BLAST), BLASTN, and BLASTX, for identifying protein coding genes corresponding to the ESTs. Subsequently, in order to pick the most relevant candidate genes for a more detailed analysis, we looked for domains/motifs in the open reading frames using SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome although we were unable to establish that the EST was indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5). We conclude that bioinformatics tools can be used to characterize

  7. Assessing Student Understanding of the "New Biology": Development and Evaluation of a Criterion-Referenced Genomics and Bioinformatics Assessment

    Science.gov (United States)

    Campbell, Chad Edward

    Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results and; (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analyses of GB textbooks (n=6). By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the assessment. An expert panel subsequently developed, evaluated, and revised two multiple-choice items to align with each of the 22 subtopics, producing a final item pool of 44 items. These items were piloted with student samples of varying content exposure levels. Both Classical Test Theory (CTT) and Item

  8. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive, ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download. Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  9. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV, HAdV–C2 and HAdV–C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910 within the hexon gene, of which HAdV–C2 and HAdV–C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  10. Identification, duplication, evolution and expression analyses of caleosins in Brassica plants and Arabidopsis subspecies.

    Science.gov (United States)

    Shen, Yue; Liu, Mingzhe; Wang, Lili; Li, Zhuowei; Taylor, David C; Li, Zhixi; Zhang, Meng

    2016-04-01

    Caleosins are a class of Ca(2+) binding proteins that appear to be ubiquitous in plants. Some of the main proteins embedded in the lipid monolayer of lipid droplets, caleosins, play critical roles in the degradation of storage lipids during germination and in lipid trafficking. Some of them have been shown to have histidine-dependent peroxygenase activity, which is believed to participate in stress responses in Arabidopsis. In the model plant Arabidopsis thaliana, caleosins have been examined extensively. However, little is known on a genome-wide scale about these proteins in other members of the Brassicaceae. In this study, 51 caleosins in Brassica plants and Arabidopsis lyrata were investigated and analyzed in silico. Among them, 31 caleosins, including 7 in A. lyrata, 11 in Brassica oleracea and 13 in Brassica napus, are herein identified for the first time. Segmental duplication was the main form of gene expansion. Alignment, motif and phylogenetic analyses showed that Brassica caleosins belong to either the H-family or the L-family with different motif structures and physicochemical properties. Our findings strongly suggest that L-caleosins are evolved from H-caleosins. Predicted phosphorylation sites were differentially conserved in H-caleosin and L-caleosins, respectively. 'RY-repeat' elements and phytohormone-related cis-elements were identified in different caleosins, which suggest diverse physiological functions. Gene structure analysis indicated that most caleosins (38 out of 44) contained six exons and five introns and their intron phases were highly conserved. Structurally integrated caleosins, such as BrCLO3-3 and BrCLO4-2, showed high expression levels and may have important roles. Some caleosins, such as BrCLO2 and BoCLO8-2, lost motifs of the calcium binding domain, proline knot, potential phosphorylation sites and haem-binding sites. Combined with their low expression, it is suggested that these caleosins may have lost function. PMID:26786939

  11. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety. PMID:26551289

  12. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu; Adebiyi, Marion; Ahmed, Azza E; Ahmed, Rehab I; Alearts, Maaike; Alibi, Mohamed; Aron, Shaun; Baichoo, Shakuntala; Bendou, Hocine; Botha, Gerrit; Brown, David; Chimusa, Emile; Christoffels, Alan; Cornick, Jennifer; Entfellner, Jean-Baka Domelevo; Fields, Chris; Fischer, Anne; Gamieldien, Junaid; Ghedira, Kais; Ghouila, Amel; Ho Sui, Shannan; Isewon, Itunuoluwa; Isokpehi, Raphael; Dashti, Mahjoubeh Jalali Sefid; Kamng'ona, Arox; Khetani, Radhika S; Kiran, Anmol; Kulohoma, Benard; Kumwenda, Benjamin; Lapine, Dan; Mainzer, Liudmila Sergeevna; Maslamoney, Suresh; Mbiyavanga, Mamana; Meintjes, Ayton; Mlyango, Flora Elias; Mmbando, Bruno; Mohammed, Somia A; Mpangase, Phelelani; Msefula, Chisomo; Mtatiro, Siana Nkya; Mugutso, Dunfunk; Mungloo-Dilmohammud, Zahra; Musicha, Patrick; Nembaware, Victoria; Osamor, Victor Chukwudi; Oyelade, Jelili; Rendon, Gloria; Salazar, Gustavo A; Salifu, Samson Pandam; Sangeda, Raphael; Souiai, Oussema; Van Heusden, Peter; Wele, Mamadou

    2016-02-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  13. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology

    Directory of Open Access Journals (Sweden)

    Jonghoon Kang

    2015-08-01

    Full Text Available New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  14. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    Science.gov (United States)

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  15. Modeling ecological drivers in marine viral communities using comparative metagenomics and network analyses

    OpenAIRE

    Hurwitz, Bonnie L.; Westveld, Anton H.; Brum, Jennifer R.; Sullivan, Matthew B.

    2014-01-01

    Microorganisms and their viruses are increasingly recognized as drivers of myriad ecosystem processes. However, our knowledge of their roles is limited by the inability of culture-dependent and culture-independent (e.g., metagenomics) methods to be fully implemented at scales relevant to the diversity found in nature. Here we combine advances in bioinformatics (shared k-mer analyses) and social networking (regression modeling) to develop an annotation- and assembly-free visualization and anal...

  16. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    Science.gov (United States)

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. PMID:26919060

  17. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    Science.gov (United States)

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use nickel metalloenzyme urease to hydrolyze urea, which is one of the commonly used nitrogen fertilizer in plant growth thus its enzymatic hydrolysis possesses vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species such as Glycine max (soybean) and Medicago truncatula (barrel medic) from Fabaceae family. With additional plant species, primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties with 833-878 amino acid residues and 89.39-90.91 kDa molecular weight with mainly acidic (5.15-6.10 pI) nature. Four protein domain structures such as urease gamma, urease beta, urease alpha and amidohydro 1 characterized the plant ureases. Secondary structure of plant ureases also demonstrated conserved protein architecture, with predominantly α-helix and random coil structures. In

  18. libcov: A C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny

    Directory of Open Access Journals (Sweden)

    Roger Andrew J

    2005-06-01

    Full Text Available Background An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. Results The bioinformatics library libcov is a collection of C++ classes that provides a high and low-level interface to maximum likelihood phylogenetics, sequence analysis and a data structure for structural biological methods. libcov can be used to compute likelihoods, search tree topologies, estimate site rates, cluster sequences, manipulate tree structures and compare phylogenies for a broad selection of applications. Conclusion Using this library, it is possible to rapidly prototype applications that use the sophistication of phylogenetic likelihoods without getting involved in a major software engineering project. libcov is thus a potentially valuable building block to develop in-house methodologies in the field of protein phylogenetics.

  19. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable.. The analysis of the datasets of Next Generation Sequencing needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Also Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the  6th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, 28-30th March, 2012 which brought together interdisciplinary scientists that have a strong background in the biological and computational sciences.

  20. Biologically inspired intelligent decision making: a commentary on the use of artificial neural networks in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2014-01-01

    Artificial neural networks (ANNs) are a class of powerful machine learning models for classification and function approximation which have analogs in nature. An ANN learns to map stimuli to responses through repeated evaluation of exemplars of the mapping. This learning approach results in networks which are recognized for their noise tolerance and ability to generalize meaningful responses for novel stimuli. It is these properties of ANNs which make them appealing for applications to bioinformatics problems where interpretation of data may not always be obvious, and where the domain knowledge required for deductive techniques is incomplete or can cause a combinatorial explosion of rules. In this paper, we provide an introduction to artificial neural network theory and review some interesting recent applications to bioinformatics problems. PMID:24335433

  1. Genomics, Proteomics & Bioinformatics (GPB) Has a New Start——Open Access

    Institute of Scientific and Technical Information of China (English)

    Jun Yu

    2012-01-01

    We are now presenting to our readers the first issue of Volume 10 of Genomics,Proteomics & Bioinformatics (GPB).It is also the first issue for the new status.GPB was founded in 2003 and published in English,focusing on research advancement in the fields of omics and bioinformatics.To ensure an international presence,GPB has its 30-50% editorial board members from outside China.From 2006,GPB has been co-published by Elsevier and Science Press,and its full-text articles are available for downloading from ScienceDirect.In 2011,GPB became a bimonthly journal.Annual downloading counts keep increasing with around 70% from outside China in 2011 (Figure 1).In addition,submissions from abroad account for 70% of the published articles after collaborating with Elsevier.

  2. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    OpenAIRE

    Katayama, T.; Arakawa, K.; Nakao, M; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain speci...

  3. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    OpenAIRE

    Katayama, Toshiaki; Arakawa, Kazuharu; Nakao, Mitsuteru; Ono, Keiichiro; Aoki-Kinoshita, Kiyoko F; Yamamoto, Yasunori; Yamaguchi, Atsuko; Kawashima, Shuichi; Chun, Hong-Woo; Aerts, Jan; Aranda, Bruno; Barboza, Lord; Bonnal, Raoul JP; Bruskiewich, Richard; Bryne, Jan C

    2010-01-01

    Abstract Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited...

  4. Fiat lux!: Phylogeny and Bioinformatics shed light on GABA functions in plants

    OpenAIRE

    Renault, Hugues

    2013-01-01

    The non-protein amino acid γ-aminobutyric acid (GABA) accumulates in plants in response to a wide variety of environmental cues. Recent data point toward an involvement of GABA in tricarboxylic acid (TCA) cycle activity and respiration, especially in stressed roots. To gain further insights into potential GABA functions in plants, phylogenetic and bioinformatic approaches were undertaken. Phylogenetic reconstruction of the GABA transaminase (GABA-T) protein family revealed the monophyletic na...

  5. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    OpenAIRE

    2010-01-01

    Background Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. Description An overview is given of ...

  6. New bioinformatic tools for analysis of nucleotide modifications in eukaryotic rRNA

    OpenAIRE

    Piekna-Przybylska, Dorota; Decatur, Wayne A.; Fournier, Maurille J.

    2007-01-01

    This report presents a valuable new bioinformatics package for research on rRNA nucleotide modifications in the ribosome, especially those created by small nucleolar RNA:protein complexes (snoRNPs). The interactive service, which is not available elsewhere, enables a user to visualize the positions of pseudouridines, 2′-O-methylations, and base methylations in three-dimensional space in the ribosome and also in linear and secondary structure formats of ribosomal RNA. Our tools provide additio...

  7. Bioinformatic Analysis of Putative Gene Products Encoded in SARS-HCoV Genome

    Institute of Scientific and Technical Information of China (English)

    赵心刚; 韩敬东; 宁元亨; 孟安明; 陈晔光

    2003-01-01

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus named as SARS-HCoV.Using bioinformatic methods, we have performed a detailed domain search.In addition to the viral structure proteins, we have found that several putative polypeptides share sequence similarity to known domains or proteins.This study may provide a basis for future studies on the infection and replication process of this notorious virus.

  8. Virginia Bioinformatics Institute, partners receive $1.45 million award for plant protection in developing countries

    OpenAIRE

    Bland, Susan

    2010-01-01

    A research collaboration led by a professor from the Virginia Bioinformatics Institute at Virginia Tech will work to develop methods to protect agriculturally important crops in developing countries from devastating attacks from plant pathogens with the support of a $1.45 million award from the Basic Research to Enable Agricultural Development (BREAD) program sponsored by the National Science Foundation (NSF) and the Bill & Melinda Gates Foundation (BMGF).

  9. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    OpenAIRE

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and con...

  10. Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis.

    OpenAIRE

    Janssens, Frizo; Glänzel, Wolfgang; De Moor, Bart

    2007-01-01

    To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1981–2004. For delineating this complex, interdisciplinary field, a novel bibliometric retrieval strategy is used. Given that the performance of unsupervised clustering and classification of scientific publications is significantly improved by deeply merging textual contents with the structure of the citation graph, w...

  11. Synergy between medical informatics and bioinformatics: facilitating genomic medicine for future health care

    OpenAIRE

    Martin-Sanchez, F.; Iakovidis, I.; Norager, S.; Maojo, V; de Groen, P; Lei, J.; Jones, T.; Abraham-Fuchs, K.; Apweiler, R; Babic, A; Baud, R.; BRETON, Vincent; Cinquin, P.; Doupi, P; Dugas, M.

    2004-01-01

    Medical Informatics (MI) and Bioinformatics (BI) are two interdisciplinary areas located at the intersection between computer science and medicine and biology, respectively. Historically, they have been separated and only occasionally have researchers of both disciplines collaborated. The completion of the Human Genome Project has brought about in this post genomic era the need for a synergy of these two disciplines to further advance in the study of diseases by correlating essential genotypi...

  12. Threshold Average Precision (TAP-k): a measure of retrieval designed for bioinformatics

    OpenAIRE

    Carroll, Hyrum D.; Kann, Maricel G.; Sheetlin, Sergey L.; Spouge, John L.

    2010-01-01

    Motivation: Since database retrieval is a fundamental operation, the measurement of retrieval efficacy is critical to progress in bioinformatics. This article points out some issues with current methods of measuring retrieval efficacy and suggests some improvements. In particular, many studies have used the pooled receiver operating characteristic for n irrelevant records (ROC n ) score, the area under the ROC curve (AUC) of a ‘pooled’ ROC curve, truncated at n irrelevant records. Unfortunate...

  13. Characterization of the human GYS2 gene and its product using bioinformatic tools

    OpenAIRE

    DİLMEÇ, Fuat

    2009-01-01

    Aim: The GYS2 gene, which encodes for glycogen synthase 2 (liver) (GS), is an enzyme responsible for the synthesis of 1,4-linked glucose chains in glycogen. The present study aimed to investigate the homology, conserved domain, and promoter and expression profiles of the human GYS2 gene among various vertebrate species using bioinformatic tools. Materials and Methods: We analyzed the homology with NCBI blast, the conserved domain with EBI ClustalW and Mega4, the promoter with Genomatix, and...

  14. Bioinformatics of Cancer ncRNA in High Throughput Sequencing: Present State and Challenges

    OpenAIRE

    Jorge, Natasha Andressa Nogueira; Ferreira, Carlos Gil; Passetti, Fabio

    2012-01-01

    The numerous genome sequencing projects produced unprecedented amount of data providing significant information to the discovery of novel non-coding RNA (ncRNA). Several ncRNAs have been described to control gene expression and display important role during cell differentiation and homeostasis. In the last decade, high throughput methods in conjunction with approaches in bioinformatics have been used to identify, classify, and evaluate the expression of hundreds of ncRNA in normal and patholo...

  15. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing project is to identify genes in genomes. However it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies from the application of different bioinformatic programs for structural annotation performed on the cucumber data set from Polish Consortium of Cucumber Genome Sequencing. We use Fgenesh, GenScan and GeneMark to automated structural annotation, the results have been compared to reference annotation.

  16. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    Science.gov (United States)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; ZiÄ bska, Karolina; PlÄ der, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promotor region. In this work we present the advantages and disadvantages from the application of different bioinformatics programs for localization of transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promotor regions. The results have been compared and possible function of chosen motifs has been described.

  17. Bioinformatic Analysis for the Validation of Novel Biomarkers for Cancer Diagnosis and Drug Sensitivity

    OpenAIRE

    Lockwood, Laura Anne Rebecca

    2015-01-01

    Background: The genetic control of tumour progression presents the opportunity for bioinformatics and gene expression data to be used as a basis for tumour grading. The development of a genetic signature based on microarray data allows for the development of personalised chemotherapeutic regimes. Method: ONCOMINE was utilised to create a genetic signature for ovarian serous adenocarcinoma and to compare the expression of genes between normal ovarian and cancerous cells. Ingenuity Pathways...

  18. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    OpenAIRE

    Fu, Wenjiang J; Stromberg, Arnold J; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2010-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral ro...

  19. AphidBase: A centralized bioinformatic resource for annotation of the pea aphid genome

    OpenAIRE

    Legeai, Fabrice; Shigenobu, Shuji; Gauthier, Jean-Pierre; Colbourne, John; Rispe, Claude; Collin, Olivier; Richards, Stephen; Wilson, Alex C. C.; Tagu, Denis

    2010-01-01

    AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System designed to organize and distribute genomic data and annotations for a large international community was constructed using open source software tools from the Generic Model Organism Database (GMOD). The system includes Apollo and GBrowse utilities as well as a wiki, blast search c...

  20. Bioinformatics Tools and Novel Challenges in Long Non-Coding RNAs (lncRNAs Functional Analysis

    Directory of Open Access Journals (Sweden)

    Andrea Masotti

    2011-12-01

    Full Text Available The advent of next generation sequencing revealed that a fraction of transcribed RNAs (short and long RNAs is non-coding. Long non-coding RNAs (lncRNAs have a crucial role in regulating gene expression and in epigenetics (chromatin and histones remodeling. LncRNAs may have different roles: gene activators (signaling, repressors (decoy, cis and trans gene expression regulators (guides and chromatin modificators (scaffolds without the need to be mutually exclusive. LncRNAs are also implicated in a number of diseases. The huge amount of inhomogeneous data produced so far poses several bioinformatics challenges spanning from the simple annotation to the more complex functional annotation. In this review, we report and discuss several bioinformatics resources freely available and dealing with the study of lncRNAs. To our knowledge, this is the first review summarizing all the available bioinformatics resources on lncRNAs appeared in the literature after the completion of the human genome project. Therefore, the aim of this review is to provide a little guide for biologists and bioinformaticians looking for dedicated resources, public repositories and other tools for lncRNAs functional analysis.

  1. SNPTrackTM : an integrated bioinformatics system for genetic association studies

    Directory of Open Access Journals (Sweden)

    Xu Joshua

    2012-07-01

    Full Text Available Abstract A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm.

  2. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Full Text Available Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR based tests for pathogenic E. coli strains. A variety of computer tools are used including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, thus are appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  3. Advances in G-protein coupled receptor research and related bioinformatics study

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    G-protein coupled receptor (GPCR) is one of the most important protein families for drug target. GPCR agonists and antagonists occupy approximately one third of the world small molecule drug market. Much effort has been invested in GPCR study by both academic institutions and pharmaceutical industries. With seven-transmembrane domains, GPCR plays significant roles in intercellular signal transduction and is involved in a variety of biological pathways. With the availability of sequence data of human and other mammalian genomes, as well as their expressed sequence tag (EST) data, the bioinformatics and genomics approaches can be applied to identifying novel GPCR in the post genomic era. Deorphanizing GPCR or matching ligands with GPCR greatly facilitates target validation process and automatically provides a possible compound screening assay. Similarly, bioinformatics data mining approach could also be applied to the identification of GPCR peptide or protein ligands. Here we give a general review of recent advances in the study of GPCR structure, function, as well as GPCR and ligand identification with the emphasis on the bioinformatics database mining of GPCR and their peptide or protein ligands.

  4. Possible Biomarkers for the Early Detection of HIV-associated Heart Diseases: A Proteomics and Bioinformatics Prediction

    Directory of Open Access Journals (Sweden)

    Suraiya Rasheed

    2015-01-01

    Full Text Available The frequency of cardiovascular disorders is increasing in HIV-infected individuals despite a significant reduction in the viral load by antiretroviral therapies (ART. Since the CD4+ T-cells are responsible for the viral load as well as immunological responses, we hypothesized that chronic HIV-infection of T-cells produces novel proteins/enzymes that cause cardiac dysfunctions. To identify specific factors that might cause cardiac disorders without the influence of numerous cofactors produced by other pathogenic microorganisms that co-inhabit most HIV-infected individuals, we analyzed genome-wide proteomes of a CD4+ T-cell line at different stages of HIV replication and cell growth over >6 months. Subtractive analyses of several hundred differentially regulated proteins from HIV-infected and uninfected counterpart cells and comparisons with proteins expressed from the same cells after treating with the antiviral drug Zidovudine/AZT and inhibiting virus replication, identified a well-coordinated network of 12 soluble/diffusible proteins in HIV-infected cells. Functional categorization, bioinformatics and statistical analyses of each protein predicted that the expression of cardiac-specific Ca2+ kinase together with multiple Ca2+ release channels causes a sustained overload of Ca2+ in the heart which induces fetal/cardiac myosin heavy chains (MYH6 and MYH7 and a myosin light-chain kinase. Each of these proteins has been shown to cause cardiac stress, arrhythmia, hypertrophic signaling, cardiomyopathy and heart failure (p = 8 × 10−11. Translational studies using the newly discovered proteins produced by HIV infection alone would provide additional biomarkers that could be added to the conventional markers for an early diagnosis and/or development of specific therapeutic interventions for heart diseases in HIV-infected individuals.

  5. Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods

    Czech Academy of Sciences Publication Activity Database

    Martin, G.; Baurens, F.C.; Droc, G.; Rouard, M.; Cenci, A.; Kilian, A.; Hastie, A.; Doležel, Jaroslav; Aury, J. M.; Alberti, A.; Carreel, F.; D'Hont, A.

    2016-01-01

    Roč. 17, MAR 16 (2016), s. 243. ISSN 1471-2164 Institutional support: RVO:61389030 Keywords : Musa acuminata * Genome assembly * Bioinformatics tool Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 3.986, year: 2014

  6. Bio-informatics Research Progress in the Post-genome Era Based on the Quantitative Analysis of SCIE

    Institute of Scientific and Technical Information of China (English)

    Yongqin; ZHAN; Min; YU

    2013-01-01

    SCIE paper output can reflect the status quo and trend of discipline research and 7 038 scientific articles concerning bioinformatics are retrieved in SCIE database during the years between 2008 and 2012. Quantitative analysis of paper output and citation frequency are conducted according to nations, institutions, publications, research direction as well as hot articles, which provides assistance for bioinformatics researchers to understand the present situation of this subject, carry out cooperative studies and display scientific research achievements.

  7. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  8. The Development of Protein Microarrays and Their Applications in DNA-Protein and Protein-Protein Interaction Analyses of Arabidopsis Transcription Factors

    Institute of Scientific and Technical Information of China (English)

    Wei Gong; Kun He; Mike Covington; S.R Dinesh-Kumar; Michael Snyder; Stacey L.Harmer; Yu-Xian Zhu; Xing Wang Deng

    2008-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to constructprotein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and proteinprotein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale.

  9. The Bioinformatic Analysis of the blcap Gene%宫颈癌相关blcap基因的生物信息学分析

    Institute of Scientific and Technical Information of China (English)

    刘娟; 熊金虎; 伍欣星

    2004-01-01

    BLCAP is a potential gene for suppression of cervical carcinoma, which was found by analysing the cervical carcinoma specimen with the oncogene and anti-oncogene cDNA microarray. Basing on the bioinformatical analyses, we try to predict the function of blcap gene. The results show that there are several genes that highly resemble with blcap. The comparability between the sequences of blcap and Homo sapiens mRNA (DKFZp564M053) or BC10 is 99% and 87%, respectively. The protein encoded by BLCAP is composed of Leu(19.5%), pro(9.19%), ser(8.04%)、 cys(8.04%) and other amino acids. The secondary structure of the N-terminal of BLCAP encoded protein is an alpha helix. In the C-terminal, it is beta sheet and in the middle, it is coil. The of the terminals is more hydrophobile than the middle region. Between 45-55aa, there is a transmembrane region. Therefore, we forecast the BLCAP is a member of transmembrane protein I. By analyzing the signal peptide and the procedure of blcap gene with the program of SignalP (V1.1), we found a cleavage site in 59-66aa. By using the program of Netpho, we predicted there might be three phospholate sites at 68aa, 73aa and 78aa. At 78-81aa, we found a typical [ST]-X [2] -[DE] structure—the phospholate site of tyrosine protein kinase, which might be related to its function. Bioinformatic studies of blcap provided the foundation for the function researches of BLCAP in laboratory.

  10. Cloning and Molecular Analyses of Hop Transcription Factors

    Czech Academy of Sciences Publication Activity Database

    Matoušek, Jaroslav; Kocábek, Tomáš; Škopek, Josef; Orctová, Lidmila; Stehlík, Jan; Füssy, Zoltán; Patzak, J.; Krofta, K.; Maloukh, L.; Roldán-Ruiz, I.; Heyerick, A.; De Keukeleire, D.

    Ghent : ISHA Acta Horticulturae, 2009 - (DeKeukeleire, D.; Hummer, K.; Heyerick, A.), s. 41-48 ISBN 978-90-6605-722-7. ISSN 0567-7572. [ISHS International Humulus Symposium /2./. Ghent (BE), 01.09.2009-05.09.2009] R&D Projects: GA ČR GA521/08/0740; GA MZe QH81052; GA ČR(CZ) GP521/06/P323 Institutional research plan: CEZ:AV0Z50510513 Keywords : transcription regulators * cis-elements * hop metabolome Subject RIV: EB - Genetics ; Molecular Biology

  11. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. These resources potency should be explored to discover new drugs for human wealth. However, the bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is developed based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolites product. Then we employ bioinformatics to explore the potency of biomaterial from Indonesia biodiversity by comparing species with the well-known taxon containing the active compound through published paper or chemical database. Then we analyze drug-likeness, bioactivity and the target proteins of the active compound based on their molecular structure. The target protein was examined their interaction with other proteins in the cell to determine action mechanism of the active compounds in the cellular level, as well as to predict its side effects and toxicity. By using this method, we succeeded to screen anti-cancer, immunomodulators and anti-inflammation from Indonesia biodiversity. For example, we found anticancer from marine invertebrate by employing the method. The anti-cancer was explore based on the isolated compounds of marine invertebrate from published article and database, and then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate able to kill cancer cell. Further, we collect and extract the active compound from the invertebrate, and then examined the activity on cancer cell (MCF7). The MTT result showed that the methanol extract of marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is cheap and robust way to explore bioactive from Indonesia biodiversity for source of drug and another

  12. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    International Nuclear Information System (INIS)

    Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.

  13. Bioinformatic analysis of microRNA networks following the activation of the constitutive androstane receptor (CAR) in mouse liver.

    Science.gov (United States)

    Hao, Ruixin; Su, Shengzhong; Wan, Yinan; Shen, Frank; Niu, Ben; Coslo, Denise M; Albert, Istvan; Han, Xing; Omiecinski, Curtis J

    2016-09-01

    The constitutive androstane receptor (CAR; NR1I3) is a member of the nuclear receptor superfamily that functions as a xenosensor, serving to regulate xenobiotic detoxification, lipid homeostasis and energy metabolism. CAR activation is also a key contributor to the development of chemical hepatocarcinogenesis in mice. The underlying pathways affected by CAR in these processes are complex and not fully elucidated. MicroRNAs (miRNAs) have emerged as critical modulators of gene expression and appear to impact many cellular pathways, including those involved in chemical detoxification and liver tumor development. In this study, we used deep sequencing approaches with an Illumina HiSeq platform to differentially profile microRNA expression patterns in livers from wild type C57BL/6J mice following CAR activation with the mouse CAR-specific ligand activator, 1,4-bis-[2-(3,5,-dichloropyridyloxy)] benzene (TCPOBOP). Bioinformatic analyses and pathway evaluations were performed leading to the identification of 51 miRNAs whose expression levels were significantly altered by TCPOBOP treatment, including mmu-miR-802-5p and miR-485-3p. Ingenuity Pathway Analysis of the differentially expressed microRNAs revealed altered effector pathways, including those involved in liver cell growth and proliferation. A functional network among CAR targeted genes and the affected microRNAs was constructed to illustrate how CAR modulation of microRNA expression may potentially mediate its biological role in mouse hepatocyte proliferation. This article is part of a Special Issue entitled: Xenobiotic nuclear receptors: New Tricks for An Old Dog, edited by Dr. Wen Xie. PMID:27080131

  14. Bioinformatics analysis of organizational and expressional characterizations of the IFNs, IRFs and CRFBs in grass carp Ctenopharyngodon idella.

    Science.gov (United States)

    Liao, Zhiwei; Wan, Quanyuan; Su, Jianguo

    2016-08-01

    Interferons (IFNs) play crucial roles in the immune response of defense against viral infection and bacteria invasion. In the present study, we systematically identified and characterized the IFNs, their regulatory factors (Interferon Regulatory Factors, IRFs) and receptors (Cytokine Receptor Family B, CRFBs) in grass carp (Ctenopharyngodon idella). Grass carp IFNs can be classified into type I IFN (IFN-I) and type II IFN (IFN-II) like other teleosts. IFN-I consist of two groups with two (group I) or four (group II) cysteines in the mature peptide and can be further divided into three subgroups (IFN-a, -c and -d), containing four members: IFN1, IFN2, IFN3, IFN4 in grass carp. IFN-II contain two members, IFNγ2 with the similarity to mammalian IFNγ and a cyprinid specific IFNγ1 (IFNγ-rel) molecule. mRNA expression analyses of IFNs discovered that IFN1 and IFN-II were sustainably expressed in many tissues, while other IFN members were transiently expressed in specific tissues and time points. In the immune response, IFN transcriptions are primarily regulated through multiple IRFs after grass carp reovirus (GCRV) challenge. IRF family possess thirteen members in grass carp, which can be further divided into four subfamilies (IRF-1, -3, -4 and -5 subfamily), each of them plays different roles in the innate and adaptive immunity via various signaling pathways to interact with IFNs (mainly IFN-I). IFNs have to bind receptors (CRFBs) to perform their functions. CRFBs as IFN receptors contain six members in grass carp. The structure and expression characterizations of IFNs, IRFs and CRFBs were analyzed using bioinformatics tools. These results might provide basic data for the further functional research of IFN system, and deeply understand fish immune mechanisms against virus infection. PMID:27012995

  15. Significance and suppression of redundant IL17 responses in acute allograft rejection by bioinformatics based drug repositioning of fenofibrate.

    Directory of Open Access Journals (Sweden)

    Silke Roedder

    Full Text Available Despite advanced immunosuppression, redundancy in the molecular diversity of acute rejection (AR often results in incomplete resolution of the injury response. We present a bioinformatics based approach for identification of these redundant molecular pathways in AR and a drug repositioning approach to suppress these using FDA approved drugs currently available for non-transplant indications. Two independent microarray data-sets from human renal allograft biopsies (n = 101 from patients on majorly Th1/IFN-y immune response targeted immunosuppression, with and without AR, were profiled. Using gene-set analysis across 3305 biological pathways, significant enrichment was found for the IL17 pathway in AR in both data-sets. Recent evidence suggests IL17 pathway as an important escape mechanism when Th1/IFN-y mediated responses are suppressed. As current immunosuppressions do not specifically target the IL17 axis, 7200 molecular compounds were interrogated for FDA approved drugs with specific inhibition of this axis. A combined IL17/IFN-y suppressive role was predicted for the antilipidemic drug Fenofibrate. To assess the immunregulatory action of Fenofibrate, we conducted in-vitro treatment of anti-CD3/CD28 stimulated human peripheral blood cells (PBMC, and, as predicted, Fenofibrate reduced IL17 and IFN-γ gene expression in stimulated PMBC. In-vivo Fenofibrate treatment of an experimental rodent model of cardiac AR reduced infiltration of total leukocytes, reduced expression of IL17/IFN-y and their pathway related genes in allografts and recipients' spleens, and extended graft survival by 21 days (p<0.007. In conclusion, this study provides important proof of concept that meta-analyses of genomic data and drug databases can provide new insights into the redundancy of the rejection response and presents an economic methodology to reposition FDA approved drugs in organ transplantation.

  16. Swarm intelligence in bioinformatics: methods and implementations for discovering patterns of multiple sequences.

    Science.gov (United States)

    Cui, Zhihua; Zhang, Yi

    2014-02-01

    As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Beneath the enormous number of open problems in this field, one fundamental issue is about the accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discover patterns of multiple sequences. To provide a deep insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithm are selected, and their applications to multiple sequence alignment and motif detecting problem are discussed. PMID:24749453

  17. Recent advances in computational biology, bioinformatics, medicine, and healthcare by modern OR

    OpenAIRE

    Türkay, Metin; Weber, Gerhard-Wilhelm; Blazewicz, Jacek; Rauner, Marion

    2014-01-01

    CEJOR (2014) 22:427–430 DOI 10.1007/s10100-013-0327-2 EDITORIAL Recent advances in computational biology, bioinformatics, medicine, and healthcare by modern OR Gerhard-Wilhelm Weber · Jacek Blazewicz · Marion Rauner · Metin Türkay Published online: 7 September 2013 © Springer-Verlag Berlin Heidelberg 2013 At the occasion of the 25th European Conference on Operational Research, EURO XXV 2012, July 8–11, 2012, in Vilnius, Lithuania (http://www.euro-2012.lt/), the ...

  18. A Generic and Dynamic Framework for Web Publishing of Bioinformatics Databases

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The present paper covers a generic and dynamic framework for the web publishing of bioinformatics databases based upon a meta data design, Java Bean, Java Server Page(JSP), Extensible Markup Language(XML), Extensible Stylesheet Language(XSL) and Extensible Stylesheet Language Transformation(XSLT). In this framework, the content is stored in a configurable and structured XML format, dynamically generated from an Oracle Relational Database Management System(RDBMS). The presentation is dynamically generated by transforming the XML document into HTML through XSLT.This clean separation between content and presentation makes the web publishing more flexible; changing the presentation only needs a modification of the Extensive Stylesheet(XS).

  19. Bioinformatics-Based Identification of Chemosensory Proteins in African Malaria Mosquito, Anopheles gambiae

    Institute of Scientific and Technical Information of China (English)

    Zhengxi Li; Zuorui Shen; Jingjiang Zhou; Lin Field

    2003-01-01

    Chemosensory proteins (CSPs) are identifiable by four spatially conserved Cysteine residues in their primary structure or by two disulfide bridges in their tertiary structure according to the previously identified olfactory specific-D related proteins. A genomics- and bioinformatics-based approach is taken in the present study to identify the putative CSPs in the malaria-carrying mosquito, Anopheles gambiae. The results show that five out of the nine annotated candidates are the most possible Anopheles CSPs of A. gambiae. This study lays the foundation for further functional identification of Anopheles CSPs, though all of these candidates need additional experimental verification.

  20. A Bioinformatics Approach for Determining Sample Identity from Different Lanes of High-Throughput Sequencing Data

    OpenAIRE

    Rachel L Goldfeder; Parker, Stephen C.J.; Ajay, Subramanian S.; Hatice Ozel Abaan; Margulies, Elliott H.

    2011-01-01

    The ability to generate whole genome data is rapidly becoming commoditized. For example, a mammalian sized genome (∼3Gb) can now be sequenced using approximately ten lanes on an Illumina HiSeq 2000. Since lanes from different runs are often combined, verifying that each lane in a genome's build is from the same sample is an important quality control. We sought to address this issue in a post hoc bioinformatic manner, instead of using upstream sample or "barcode" modifications. We rely on the ...

  1. The application of genomics and bioinformatics to accelerate crop improvement in a changing climate.

    Science.gov (United States)

    Batley, Jacqueline; Edwards, David

    2016-04-01

    The changing climate and growing global population will increase pressure on our ability to produce sufficient food. The breeding of novel crops and the adaptation of current crops to the new environment are required to ensure continued food production. Advances in genomics offer the potential to accelerate the genomics based breeding of crop plants. However, relating genomic data to climate related agronomic traits for use in breeding remains a huge challenge, and one which will require coordination of diverse skills and expertise. Bioinformatics, when combined with genomics has the potential to help maintain food security in the face of climate change through the accelerated production of climate ready crops. PMID:26926905

  2. A Bioinformatics Approach for Determining Sample Identity from Different Lanes of High-Throughput Sequencing Data

    OpenAIRE

    Goldfeder, Rachel L.; Stephen C J Parker; Ajay, Subramanian S.; Ozel Abaan, Hatice; Margulies, Elliott H.

    2011-01-01

    The ability to generate whole genome data is rapidly becoming commoditized. For example, a mammalian sized genome (∼3Gb) can now be sequenced using approximately ten lanes on an Illumina HiSeq 2000. Since lanes from different runs are often combined, verifying that each lane in a genome's build is from the same sample is an important quality control. We sought to address this issue in a post hoc bioinformatic manner, instead of using upstream sample or “barcode” modifications. We rely on the ...

  3. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Abstract Background Poikiloderma with Neutropenia (PN is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results We characterize six PN patients expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations that depicts clusters featuring a founder effect. Conclusions In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of C16orf57 gene seems the proper way to address the correct diagnosis of PN, enabling the syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  4. Periodic safety analyses

    International Nuclear Information System (INIS)

    The IAEA Safety Guide 50-SG-S8 devoted to 'Safety Aspects of Foundations of Nuclear Power Plants' indicates that operator of a NPP should establish a program for inspection of safe operation during construction, start-up and service life of the plant for obtaining data needed for estimating the life time of structures and components. At the same time the program should ensure that the safety margins are appropriate. Periodic safety analysis are an important part of the safety inspection program. Periodic safety reports is a method for testing the whole system or a part of the safety system following the precise criteria. Periodic safety analyses are not meant for qualification of the plant components. Separate analyses are devoted to: start-up, qualification of components and materials, and aging. All these analyses are described in this presentation. The last chapter describes the experience obtained for PWR-900 and PWR-1300 units from 1986-1989

  5. Laser Beam Focus Analyser

    DEFF Research Database (Denmark)

    Nielsen, Peter Carøe; Hansen, Hans Nørgaard; Olsen, Flemming Ove;

    2007-01-01

    The quantitative and qualitative description of laser beam characteristics is important for process implementation and optimisation. In particular, a need for quantitative characterisation of beam diameter was identified when using fibre lasers for micro manufacturing. Here the beam diameter limits...... the obtainable features in direct laser machining as well as heat affected zones in welding processes. This paper describes the development of a measuring unit capable of analysing beam shape and diameter of lasers to be used in manufacturing processes. The analyser is based on the principle of a rotating...... mechanical wire being swept through the laser beam at varying Z-heights. The reflected signal is analysed and the resulting beam profile determined. The development comprised the design of a flexible fixture capable of providing both rotation and Z-axis movement, control software including data capture...

  6. Report sensory analyses veal

    OpenAIRE

    Veldman, M.; Schelvis-Smit, A.A.M.

    2005-01-01

    On behalf of a client of Animal Sciences Group, different varieties of veal were analyzed by both instrumental and sensory analyses. The sensory evaluation was performed with a sensory analytical panel in the period of 13th of May and 31st of May, 2005. The three varieties of veal were: young bull, pink veal and white veal. The sensory descriptive analyses show that the three groups Young bulls, pink veal and white veal, differ significantly in red colour for the raw meat as well as the baked...

  7. Wavelet Analyses and Applications

    Science.gov (United States)

    Bordeianu, Cristian C.; Landau, Rubin H.; Paez, Manuel J.

    2009-01-01

    It is shown how a modern extension of Fourier analysis known as wavelet analysis is applied to signals containing multiscale information. First, a continuous wavelet transform is used to analyse the spectrum of a nonstationary signal (one whose form changes in time). The spectral analysis of such a signal gives the strength of the signal in each…

  8. Probabilistic safety analyses (PSA)

    International Nuclear Information System (INIS)

    The guide shows how the probabilistic safety analyses (PSA) are used in the design, construction and operation of light water reactor plants in order for their part to ensure that the safety of the plant is good enough in all plant operational states

  9. Discovery-driven research and bioinformatics in nuclear receptor and coregulator signaling.

    Science.gov (United States)

    McKenna, Neil J

    2011-08-01

    Nuclear receptors (NRs) are a superfamily of ligand-regulated transcription factors that interact with coregulators and other transcription factors to direct tissue-specific programs of gene expression. Recent years have witnessed a rapid acceleration of the output of high-content data platforms in this field, generating discovery-driven datasets that have collectively described: the organization of the NR superfamily (phylogenomics); the expression patterns of NRs, coregulators and their target genes (transcriptomics); ligand- and tissue-specific functional NR and coregulator sites in DNA (cistromics); the organization of nuclear receptors and coregulators into higher order complexes (proteomics); and their downstream effects on homeostasis and metabolism (metabolomics). Significant bioinformatics challenges lie ahead both in the integration of this information into meaningful models of NR and coregulator biology, as well as in the archiving and communication of datasets to the global nuclear receptor signaling community. While holding great promise for the field, the ascendancy of discovery-driven research in this field brings with it a collective responsibility for researchers, publishers and funding agencies alike to ensure the effective archiving and management of these data. This review will discuss factors lying behind the increasing impact of discovery-driven research, examples of high-content datasets and their bioinformatic analysis, as well as a summary of currently curated web resources in this field. This article is part of a Special Issue entitled: Translating nuclear receptors from health to disease. PMID:21029773

  10. GITIRBio: A Semantic and Distributed Service Oriented-Architecture for Bioinformatics Pipeline.

    Science.gov (United States)

    Castillo, Luis F; López-Gartner, Germán; Isaza, Gustavo A; Sánchez, Mariana; Arango, Jeferson; Agudelo-Valencia, Daniel; Castaño, Sergio

    2015-01-01

    The need to process large quantities of data generated from genomic sequencing has resulted in a difficult task for life scientists who are not familiar with the use of command-line operations or developments in high performance computing and parallelization. This knowledge gap, along with unfamiliarity with necessary processes, can hinder the execution of data processing tasks. Furthermore, many of the commonly used bioinformatics tools for the scientific community are presented as isolated, unrelated entities that do not provide an integrated, guided, and assisted interaction with the scheduling facilities of computational resources or distribution, processing and mapping with runtime analysis. This paper presents the first approximation of a Web Services platform-based architecture (GITIRBio) that acts as a distributed front-end system for autonomous and assisted processing of parallel bioinformatics pipelines that has been validated using multiple sequences. Additionally, this platform allows integration with semantic repositories of genes for search annotations. GITIRBio is available at: http://c-head.ucaldas.edu.co:8080/gitirbio. PMID:26527189

  11. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  12. Chemical proteomic and bioinformatic strategies for the identification and quantification of vascular antigens in cancer.

    Science.gov (United States)

    Strassberger, Verena; Fugmann, Tim; Neri, Dario; Roesli, Christoph

    2010-09-10

    One avenue towards the development of more selective anti-cancer drugs consists in the targeted delivery of bioactive molecules to the tumor environment by means of binding molecules specific to tumor-associated markers. In this context, the targeted delivery of therapeutic agents to newly-formed blood vessels ("vascular targeting") is particularly attractive, because of the dependence of tumors on new blood vessels to sustain growth and invasion, and because of the accessibility of neo-vascular structures for therapeutic agents injected intravenously. Ligand-based vascular targeting strategies crucially rely on good-quality vascular tumor markers. Here we describe a number of established technologies for the enrichment of accessible vascular proteins based on the isolation of glycoproteins, the in vivo coating of accessible cell surfaces with colloidal silica and the in vivo perfusion with reactive ester derivatives of biotin. Label-free as well as isotopic labeling based strategies for the subsequent MS-based protein quantification are outlined. Finally, bioinformatic workflows for protein quantification are depicted aiming at assisting in the evaluation of appropriate strategies for individual projects. This review gives an overview of current chemical proteomic strategies for the enrichment and quantification of the accessible vascular proteome and helps in selecting bioinformatic strategies for data analysis and validation. PMID:20538087

  13. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  14. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    Full Text Available OBJECTIVE: Candidate genes for non-alcoholic fatty liver disease (NAFLD identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. RESEARCH DESIGN AND METHODS: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs which captured all common variation in these genes were genotyped in 10,196 Danes, and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D, central obesity, and WHO-defined metabolic syndrome (MetS. RESULTS: 273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominal statistical significant associations (P<0.05 to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. CONCLUSIONS: Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.

  15. Documenting the emergence of bio-ontologies: or, why researching bioinformatics requires HPSSB.

    Science.gov (United States)

    Leonelli, Sabina

    2010-01-01

    This paper reflects on the analytic challenges emerging from the study of bioinformatic tools recently created to store and disseminate biological data, such as databases, repositories, and bio-ontologies. I focus my discussion on the Gene Ontology, a term that defines three entities at once: a classification system facilitating the distribution and use of genomic data as evidence towards new insights; an expert community specialised in the curation of those data; and a scientific institution promoting the use of this tool among experimental biologists. These three dimensions of the Gene Ontology can be clearly distinguished analytically, but are tightly intertwined in practice. I suggest that this is true of all bioinformatic tools: they need to be understood simultaneously as epistemic, social, and institutional entities, since they shape the knowledge extracted from data and at the same time regulate the organisation, development, and communication of research. This viewpoint has one important implication for the methodologies used to study these tools; that is, the need to integrate historical, philosophical, and sociological approaches. I illustrate this claim through examples of misunderstandings that may result from a narrowly disciplinary study of the Gene Ontology, as I experienced them in my own research. PMID:20848809

  16. BioZone Exploting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    Energy Technology Data Exchange (ETDEWEB)

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  17. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA. PMID:27103887

  18. Bioinformatics Analysis for Coding SNPs of the HLADQA1 Gene Involved in Susceptibility to Cervical Cancer

    Institute of Scientific and Technical Information of China (English)

    Yanyun Li; Jun Xing; Linsheng Zhao; Yanni Li; Yuchuan Wang; Weiming Zhang

    2006-01-01

    OBJECTIVE To analyze coding SNPs of the HLA-DQA1 gene involved in susceptibility for cervical cancer by a bioinformatics approach, and to choose some SNPs that may have an association with cervical cancer.METHODS By a SNPper tool we extracted SNPs from a public database (dbSNP), exporting them in FASTA formats suitable for subsequent use.Then we used PARSESNP as a tool for the analysis of the cSNPs.RESULTS In the cSNPs of the HLA-DQA1 gene, we find that rs9272693and rs9272703, are made up of missense mutations which convert a codon for one amino acid into a codon for a different amino acid. We chose a PSSM Difference >10 as a lower level for the scores of changes predicted to be deldterious.CONCLUSION We used a bioinformatics approach for cSNPs analysis of the HLA-DQA1 gene. This method can select the variants in a conserved region, and give a PSSM Difference score. But the results need to be verified in cervical cancer patients and a control population.

  19. Tools and data services registry: a community effort to document bioinformatics resources.

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C E; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand.Here we present a community-driven curation effort, supported by ELIXIR-the European infrastructure for biological information-that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners.As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  20. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    Science.gov (United States)

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. PMID:25195583

  1. BioZoom: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    Energy Technology Data Exchange (ETDEWEB)

    Liu, L; Buttler, D; Critchlow, T J; Han, W; Paques, H; Pu, C; Rocco, D

    2003-01-09

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  2. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis.

    Science.gov (United States)

    Nasir, Waqas; Toledo, Alejandro Gomez; Noborn, Fredrik; Nilsson, Jonas; Wang, Mingxun; Bandeira, Nuno; Larson, Göran

    2016-08-01

    Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans. PMID:27399812

  3. Public or private economies of knowledge: The economics of diffusion and appropriation of bioinformatics tools

    Directory of Open Access Journals (Sweden)

    Mark Harvey

    2009-12-01

    Full Text Available The past three decades have witnessed a period of great turbulence in the economies of biological knowledge, during which there has been great uncertainty as to how and where boundaries could be drawn between public or private knowledge especially with regard to the explosive growth in biological databases and their related bioinformatic tools. This paper will focus on some of the key software tools developed in relation to bio-databases. It will argue that bioinformatic tools are particularly economically unstable, and that there is a continuing tension and competition between their public and private modes of production, appropriation, distribution, and use. The paper adopts an ‘instituted economic process’ approach, and in this paper will elaborate on processes of making knowledge public in the creation of ‘public goods’. The question is one of continuously creating and sustaining new institutions of the commons. We believe this critical to an understanding of the division and interdependency between public and private economies of knowledge.

  4. Genome-Wide Evolutionary Characterization and Expression Analyses of WRKY Family Genes in Brachypodium distachyon

    OpenAIRE

    WEN, FENG; Zhu, Hong; Li, Peng; Jiang, Min; Mao, Wenqing; Ong, Chermaine; Chu, Zhaoqing

    2014-01-01

    Members of plant WRKY gene family are ancient transcription factors that function in plant growth and development and respond to biotic and abiotic stresses. In our present study, we have investigated WRKY family genes in Brachypodium distachyon, a new model plant of family Poaceae. We identified a total of 86 WRKY genes from B. distachyon and explored their chromosomal distribution and evolution, domain alignment, promoter cis-elements, and expression profiles. Combining the analysis of phyl...

  5. Evaluating the effectiveness of a practical inquiry-based learning bioinformatics module on undergraduate student engagement and applied skills.

    Science.gov (United States)

    Brown, James A L

    2016-05-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion, qualitative student-based module evaluation and the novelty, scientific validity and quality of written student reports. Bioinformatics is often the starting point for laboratory-based research projects, therefore high importance was placed on allowing students to individually develop and apply processes and methods of scientific research. Students led a bioinformatic inquiry-based project (within a framework of inquiry), discovering, justifying and exploring individually discovered research targets. Detailed assessable reports were produced, displaying data generated and the resources used. Mimicking research settings, undergraduates were divided into small collaborative groups, with distinctive central themes. The module was evaluated by assessing the quality and originality of the students' targets through reports, reflecting students' use and understanding of concepts and tools required to generate their data. Furthermore, evaluation of the bioinformatic module was assessed semi-quantitatively using pre- and post-module quizzes (a non-assessable activity, not contributing to their grade), which incorporated process- and content-specific questions (indicative of their use of the online tools). Qualitative assessment of the teaching intervention was performed using post-module surveys, exploring student satisfaction and other module specific elements. Overall, a positive experience was found, as was a post module increase in correct process-specific answers. In conclusion, an inquiry-based peer-assisted learning module increased students' engagement, practical bioinformatic skills and process-specific knowledge. © 2016 by

  6. Possible future HERA analyses

    CERN Document Server

    Geiser, Achim

    2015-01-01

    A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing $ep$ collider data and their physics scope. Comparisons to the original scope of the HERA programme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-e...

  7. Statistisk analyse med SPSS

    OpenAIRE

    Linnerud, Kristin; Oklevik, Ove; Slettvold, Harald

    2004-01-01

    Dette notatet har sitt utspring i forelesninger og undervisning for 3.års studenter i økonomi og administrasjon ved høgskolen i Sogn og Fjordane. Notatet er særlig lagt opp mot undervisningen i SPSS i de to kursene ”OR 685 Marknadsanalyse og merkevarestrategi” og ”BD 616 Økonomistyring og analyse med programvare”.

  8. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    DEFF Research Database (Denmark)

    Babbitt, Patricia C.; Bagos, Pantelis G.; Bairoch, Amos;

    2015-01-01

    During 11–12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from...... protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication...

  9. Biomass feedstock analyses

    Energy Technology Data Exchange (ETDEWEB)

    Wilen, C.; Moilanen, A.; Kurkela, E. [VTT Energy, Espoo (Finland). Energy Production Technologies

    1996-12-31

    The overall objectives of the project `Feasibility of electricity production from biomass by pressurized gasification systems` within the EC Research Programme JOULE II were to evaluate the potential of advanced power production systems based on biomass gasification and to study the technical and economic feasibility of these new processes with different type of biomass feed stocks. This report was prepared as part of this R and D project. The objectives of this task were to perform fuel analyses of potential woody and herbaceous biomasses with specific regard to the gasification properties of the selected feed stocks. The analyses of 15 Scandinavian and European biomass feed stock included density, proximate and ultimate analyses, trace compounds, ash composition and fusion behaviour in oxidizing and reducing atmospheres. The wood-derived fuels, such as whole-tree chips, forest residues, bark and to some extent willow, can be expected to have good gasification properties. Difficulties caused by ash fusion and sintering in straw combustion and gasification are generally known. The ash and alkali metal contents of the European biomasses harvested in Italy resembled those of the Nordic straws, and it is expected that they behave to a great extent as straw in gasification. Any direct relation between the ash fusion behavior (determined according to the standard method) and, for instance, the alkali metal content was not found in the laboratory determinations. A more profound characterisation of the fuels would require gasification experiments in a thermobalance and a PDU (Process development Unit) rig. (orig.) (10 refs.)

  10. Possible future HERA analyses

    Energy Technology Data Exchange (ETDEWEB)

    Geiser, Achim

    2015-12-15

    A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing ep collider data and their physics scope. Comparisons to the original scope of the HERA pro- gramme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-established data and MC sets, calibrations, and analysis procedures the manpower and expertise needed for a particular analysis is often very much smaller than that needed for an ongoing experiment. Since centrally funded manpower to carry out such analyses is not available any longer, this contribution not only targets experienced self-funded experimentalists, but also theorists and master-level students who might wish to carry out such an analysis.

  11. PRE-1, a cis element sufficient to enhance cone- and rod- specific expression in differentiating zebrafish photoreceptors

    Directory of Open Access Journals (Sweden)

    Hurley James B

    2011-01-01

    Full Text Available Abstract Background Appropriate transcriptional regulation is required for cone photoreceptor development and integrity. To date, only a few cis-regulatory elements that control cone photoreceptor-specific expression have been characterised. The alpha-subunit of cone transducin (TαC is specifically expressed in cone photoreceptors and is required for colour vision. In order to better understand the molecular genetics controlling the initiation of cone photoreceptor-specific expression in vivo, we have utilised zebrafish to identify cis-regulatory elements in the upstream promoter region of the TαC gene. Results A 0.5 kb TαC promoter fragment is sufficient to direct cone-specific expression in transgenic larvae. Within this minimal promoter, we identify photoreceptor regulatory element-1 (PRE-1, a unique 41 bp sequence. PRE-1 specifically binds nuclear factors expressed in ocular tissue. PRE-1 is not required for cone-specific expression directed from a 2.5 kb TαC promoter. However, PRE-1-like sequences, with potential functional redundancy, are located in this 2.5 kb promoter. PRE-1-rho which has the highest sequence and structural homology to PRE-1 is located in the rhodopsin promoter. Surprisingly, PRE-1 and PRE-1-rho are functionally distinct. We demonstrate that PRE-1, but not PRE-1-rho, is sufficient to enhance expression from a heterologous UV cone promoter. PRE-1 is also sufficient to enhance expression from a heterologous rhodopsin promoter without altering its rod photoreceptor specificity. Finally, mutations in consensus E-box and Otx sites prevent PRE-1 from forming complexes with eye nuclear protein and enhancing photoreceptor expression. Conclusions PRE-1 is a novel cis-regulatory module that is sufficient to enhance the initiation of photoreceptor-specific gene expression in differentiating rod and cone photoreceptors.

  12. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    DEFF Research Database (Denmark)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.;

    2005-01-01

    years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  13. Molecular dissection of the AGAMOUS control region shows that cis elements for spatial regulation are located intragenically.

    Science.gov (United States)

    Sieburth, L E; Meyerowitz, E M

    1997-03-01

    AGAMOUS (AG) is an Arabidopsis MADS box gene required for the normal development of the internal two whorls of the flower. AG RNA accumulates in distinct patterns early and late in flower development, and several genes have been identified as regulators of AG gene expression based on altered AG RNA accumulation in mutants. To understand AG regulatory circuits, we are now identifying cis regulatory domains by characterizing AG::beta-glucuronidase (GUS) gene fusions. These studies show that a normal AG::GUS staining pattern is conferred by a 9.8-kb region encompassing 6 kb of upstream sequences and 3.8 kb of intragenic sequences. Constructs lacking the 3.8-kb intragenic sequences confer a GUS staining pattern that deviates both spatially and temporally from normal AG expression. The GUS staining patterns in the mutants for the three negative regulators of AG, apetala2, leunig, and curly leaf, showed the predicted change of expression for the construct containing the intragenic sequences, but no significant change was observed for the constructs lacking this intragenic region. These results suggest that intragenic sequences are essential for AG regulation and that these intragenic sequences contain the ultimate target sites for at least some of the known regulatory molecules. PMID:9090880

  14. Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

    OpenAIRE

    Lundberg Cathryn; Snøve Ola; Sætrom Pål; Smith David D; Rivas Guillermo E; Glackin Carlotta; Larson Garrett P

    2008-01-01

    Abstract Background Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to emp...

  15. Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

    Directory of Open Access Journals (Sweden)

    Lundberg Cathryn

    2008-01-01

    Full Text Available Abstract Background Gene expression measurements from breast cancer (BrCa tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays. Results We gathered expression data from 9 published microarray studies examining estrogen receptor positive (ER+ and estrogen receptor negative (ER- BrCa tumor cases from the Oncomine database. We performed a meta-analysis and identified genes that were universally up or down regulated with respect to ER+ versus ER- tumor status. We surveyed both the proximal promoter and 3' untranslated regions (3'UTR of our top-ranking genes in each expression group to test whether common sequence elements may contribute to the observed expression patterns. Utilizing a combination of known transcription factor binding sites (TFBS, evolutionarily conserved mammalian promoter and 3'UTR motifs, and microRNA (miRNA seed sequences, we identified numerous motifs that were disproportionately represented between the two gene classes suggesting a common regulatory network for the observed gene expression patterns. Conclusion Some of the genes we identified distinguish key transcripts previously seen in array studies, while others are newly defined. Many of the genes identified as overexpressed in ER- tumors were previously identified as expression markers for neoplastic transformation in multiple human cancers. Moreover, our motif analysis identified a collection of specific cis-acting target sites which may collectively play a role in the differential gene expression patterns observed in ER+ versus ER- breast cancer tumors. Importantly, the gene sets and associated DNA motifs provide a starting point with which to explore the mechanistic basis for the observed expression patterns in breast tumors.

  16. Digital differential analysers

    CERN Document Server

    Shilejko, A V; Higinbotham, W

    1964-01-01

    Digital Differential Analysers presents the principles, operations, design, and applications of digital differential analyzers, a machine with the ability to present initial quantities and the possibility of dividing them into separate functional units performing a number of basic mathematical operations. The book discusses the theoretical principles underlying the operation of digital differential analyzers, such as the use of the delta-modulation method and function-generator units. Digital integration methods and the classes of digital differential analyzer designs are also reviewed. The te

  17. Analysing Access Control Specifications

    DEFF Research Database (Denmark)

    Probst, Christian W.; Hansen, René Rydhof

    2009-01-01

    common tool to answer this question, analysis of log files, faces the problem that the amount of logged data may be overwhelming. This problems gets even worse in the case of insider attacks, where the attacker’s actions usually will be logged as permissible, standard actions—if they are logged at all....... Recent events have revealed intimate knowledge of surveillance and control systems on the side of the attacker, making it often impossible to deduce the identity of an inside attacker from logged data. In this work we present an approach that analyses the access control configuration to identify the set...

  18. AMS analyses at ANSTO

    Energy Technology Data Exchange (ETDEWEB)

    Lawson, E.M. [Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW (Australia). Physics Division

    1998-03-01

    The major use of ANTARES is Accelerator Mass Spectrometry (AMS) with {sup 14}C being the most commonly analysed radioisotope - presently about 35 % of the available beam time on ANTARES is used for {sup 14}C measurements. The accelerator measurements are supported by, and dependent on, a strong sample preparation section. The ANTARES AMS facility supports a wide range of investigations into fields such as global climate change, ice cores, oceanography, dendrochronology, anthropology, and classical and Australian archaeology. Described here are some examples of the ways in which AMS has been applied to support research into the archaeology, prehistory and culture of this continent`s indigenous Aboriginal peoples. (author)

  19. AMS analyses at ANSTO

    International Nuclear Information System (INIS)

    The major use of ANTARES is Accelerator Mass Spectrometry (AMS) with 14C being the most commonly analysed radioisotope - presently about 35 % of the available beam time on ANTARES is used for 14C measurements. The accelerator measurements are supported by, and dependent on, a strong sample preparation section. The ANTARES AMS facility supports a wide range of investigations into fields such as global climate change, ice cores, oceanography, dendrochronology, anthropology, and classical and Australian archaeology. Described here are some examples of the ways in which AMS has been applied to support research into the archaeology, prehistory and culture of this continent's indigenous Aboriginal peoples. (author)

  20. Systemdynamisk analyse av vannkraftsystem

    OpenAIRE

    Rydning, Anja

    2007-01-01

    I denne oppgaven er det gjennomført en dynamisk analyse av vannkraftverket Fortun kraftverk. Tre fenomener er særlig vurdert i denne oppgaven: Sjaktsvingninger mellom svingesjakt og magasin, trykkstøt ved turbinen som følge av retardasjonstrykk ved endring i turbinvannføringen og reguleringsstabilitet. Sjaktsvingningene og trykkstøt beregnes analytisk ut fra kontinuitets- og bevegelsesligningen. Modeller av Fortun kraftverk er laget for å beregne trykkstøt og sjaktsvingninger. En modell e...

  1. Assessing next-generation sequencing and 4 bioinformatics tools for detection of Enterovirus D68 and other respiratory viruses in clinical samples.

    Science.gov (United States)

    Huang, Weihua; Wang, Guiqing; Lin, Henry; Zhuge, Jian; Nolan, Sheila M; Vail, Eric; Dimitrova, Nevenka; Fallon, John T

    2016-05-01

    We used 4 different bioinformatics algorithms to evaluate the application of a metagenomic shot-gun sequencing method in detection of Enterovirus D68 and other respiratory viruses in clinical specimens. Our data supported that next-generation sequencing, combined with improved bioinformatics tools, is practically feasible and useful for clinical diagnosis of viral infections. PMID:26971640

  2. NFFinder: an online bioinformatics tool for searching similar transcriptomics experiments in the context of drug repositioning.

    Science.gov (United States)

    Setoain, Javier; Franch, Mònica; Martínez, Marta; Tabas-Madrid, Daniel; Sorzano, Carlos O S; Bakker, Annette; Gonzalez-Couto, Eduardo; Elvira, Juan; Pascual-Montano, Alberto

    2015-07-01

    Drug repositioning, using known drugs for treating conditions different from those the drug was originally designed to treat, is an important drug discovery tool that allows for a faster and cheaper development process by using drugs that are already approved or in an advanced trial stage for another purpose. This is especially relevant for orphan diseases because they affect too few people to make drug research de novo economically viable. In this paper we present NFFinder, a bioinformatics tool for identifying potential useful drugs in the context of orphan diseases. NFFinder uses transcriptomic data to find relationships between drugs, diseases and a phenotype of interest, as well as identifying experts having published on that domain. The application shows in a dashboard a series of graphics and tables designed to help researchers formulate repositioning hypotheses and identify potential biological relationships between drugs and diseases. NFFinder is freely available at http://nffinder.cnb.csic.es. PMID:25940629

  3. Bioinformatic analysis ofhuman nuclear receptornr5a2(hblf) genomic sequence

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    We have cloned the cDNA of human nuclear receptor nrSa2(hb1f) gene and obtained its whole genomic sequence previously. In this work we carried out in-depth bioinformatic analysis on the genomic sequence of nrSa2(hb1f) gene. Sequence comparison and prediction algorithms implicated that there might be additional coding regions in the 210 kb genomic sequence besides known exons,especially in the two largest introns. Comparison of the structures of nr5a loci in different species revealed distinguishable conservation and apparent gene duplication during evolution. The remarkable conservation among promoters of zebrafish, mouse and human nr5a2 genes suggested that they would be regulated by the same transcription factors.

  4. Detecting chimeric 5′/3′UTRs with cross-chromosomal splicing by bioinformatics

    Institute of Scientific and Technical Information of China (English)

    ZHANG Zhihua; ZHANG Yong; SHI Baochen; DENG Wei; ZHAO Yi; CHEN Runsheng

    2004-01-01

    The 5′/3′ UTRs of mRNA are crucial in translational regulation, and several serious diseases are believed to be associated with abnormal splicing of these parts of the mRNA sequence. In this work a novel method which uses sequence alignment database searching for detecting chimeric 5′3′ UTRs with cross-chromosomal splicing is reported. Eight highly credible instances of cross-chromosomal splicing have been found using this method, representing additional confirmation of the existence of cross-chromosomal splicing events provided by bioinformatics tools. Since no conserved motif has been found in any of the eight instances, and at the same time current prediction algorithms produce only trivial secondary structures at the "splicing sites", it is not possible to identify any specific signal leading to the splicing.

  5. Thermodynamic description of Beta amyloid formation using physicochemical scales and fractal bioinformatic scales.

    Science.gov (United States)

    Phillips, J C

    2015-05-20

    Protein function depends on both protein structure and amino acid (aa) sequence. Here we show that modular features of both structure and function can be quantified economically from the aa sequences alone for the small (40,42 aa) plaque-forming (aggregative) amyloid beta fragments. Some edge and center features of the fragments are predicted. Bioinformatic scales based on β strand formation propensities and the thermodynamically second order fractal hydropathicity scale based on evolutionary optimization (self-organized criticality) are contrasted with the standard first order physicochemical scale based on complete protein (water-air) unfolding. The results are consistent with previous studies of these physicochemical factors that show that aggregative properties, even of beta fragments, are driven primarily by near-equilibrium hydropathic forces. PMID:25702750

  6. Lecture 10: The European Bioinformatics Institute - "Big data" for biomedical sciences

    CERN Document Server

    CERN. Geneva; Dana, Jose

    2013-01-01

    Part 1: Big data for biomedical sciences (Tom Hancocks) Ten years ago witnessed the completion of the first international 'Big Biology' project that sequenced the human genome. In the years since biological sciences, have seen a vast growth in data. In the coming years advances will come from integration of experimental approaches and the translation into applied technologies is the hospital, clinic and even at home. This talk will examine the development of infrastructure, physical and virtual, that will allow millions of life scientists across Europe better access to biological data Tom studied Human Genetics at the University of Leeds and McMaster University, before completing an MSc in Analytical Genomics at the University of Birmingham. He has worked for the UK National Health Service in diagnostic genetics and in training healthcare scientists and clinicians in bioinformatics. Tom joined the EBI in 2012 and is responsible for the scientific development and delivery of training for the BioMedBridges pr...

  7. FY02 CBNP Annual Report Input: Bioinformatics Support for CBNP Research and Deployments

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Wolinsky, M

    2002-10-31

    The events of FY01 dynamically reprogrammed the objectives of the CBNP bioinformatics support team, to meet rapidly-changing Homeland Defense needs and requests from other agencies for assistance: Use computational techniques to determine potential unique DNA signature candidates for microbial and viral pathogens of interest to CBNP researcher and to our collaborating partner agencies such as the Centers for Disease Control and Prevention (CDC), U.S. Department of Agriculture (USDA), Department of Defense (DOD), and Food and Drug Administration (FDA). Develop effective electronic screening measures for DNA signatures to reduce the cost and time of wet-bench screening. Build a comprehensive system for tracking the development and testing of DNA signatures. Build a chain-of-custody sample tracking system for field deployment of the DNA signatures as part of the BASIS project. Provide computational tools for use by CBNP Biological Foundations researchers.

  8. Bioinformatics analysis of breast cancer bone metastasis related gene-CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze breast cancer bone metastasis related gene-CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-flux gene chip. Results:It was found that the expressions of 396 genes were different including 165 up-regulations and 231 down-regulations. The expression of chemokine receptor CXCR4 was obviously up-regulated in the tissue with breast cancer bone metastasis. Compared with the tissue without bone metastasis, there was significant difference, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  9. The Use of Bioinformatics for Studying HIV Evolutionary and Epidemiological History in South America

    Directory of Open Access Journals (Sweden)

    Gonzalo Bello

    2011-01-01

    Full Text Available The South American human immunodeficiency virus type 1 (HIV-1 epidemic is driven by several subtypes (B, C, and F1 and circulating and unique recombinant forms derived from those subtypes. Those variants are heterogeneously distributed around the continent in a country-specific manner. Despite some inconsistencies mainly derived from sampling biases and analytical constrains, most of studies carried out in the area agreed in pointing out specificities in the evolutionary dynamics of the circulating HIV-1 lineages. In this paper, we covered the theoretical basis, and the application of bioinformatics methods to reconstruct the HIV spatial-temporal dynamics, unveiling relevant information to understand the origin, geographical dissemination and the current molecular scenario of the HIV epidemic in the continent, particularly in the countries of Southern Cone.

  10. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    Science.gov (United States)

    Ju, Feng; Zhang, Tong

    2015-11-01

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation. PMID:26451629

  11. 7th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Nanni, Loris; Rocha, Miguel; Fdez-Riverola, Florentino

    2013-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable and the trend is to increase its pace. In fact, the need for computational techniques that can efficiently handle the huge amounts of data produced by the new experimental techniques in Biology is still increasing driven by new advances in Next Generation Sequencing, several types of the so called omics data and image acquisition, just to name a few. The analysis of the datasets that produces and its integration call for new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Within this scenario of increasing data availability, Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. Indeed, Biology is more and more a science of information requiring tools from the computational sciences. In the last few years, we ...

  12. CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Ussery, David

    2004-01-01

    Currently, new bacterial genomes are being published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are...... detailed comparative genomics. DNA structural calculations like curvature and stacking energy, DNA compositions like base skews, oligo skews and repeats at the local and global level are just a few of the analysis that are presented on the CBS Genome Atlas Web page. Complex analysis, changing methods and...... frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these...

  13. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine

    Directory of Open Access Journals (Sweden)

    Ellen A. Tsai

    2016-02-01

    Full Text Available Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient’s genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS.

  14. Chapter 5. Safety analyses

    International Nuclear Information System (INIS)

    In 2000 the safety analyses of the Nuclear Regulatory Authority of the Slovak Republic (UJD) were focused on verification of the safety analyses report and probabilistic safety assessment study for NPP V-1 Bohunice after the reconstruction, reviewing of the suggested changes of the Limits and Conditions for NPP V-2 Bohunice and on the assessment of operational events. An important part of work was performed also in solving of scientific and technical tasks appointed within bilateral projects of co-operation between UJD and its international partnerships' organisations, i.e. within international PHARE programme as well as the 5th framework of the European Commission. Verification of safety analyses part of the safety report for NPP V-1 Bohunice after the gradual reconstruction was focused on checking and passing judgement on the completeness of the considered initiating events, safety criteria, input data, adequacy of the used calculation models and also on the overall quality of the submitted documentation. Suitability of the used methodology and the calculation programmes, achieved level of their verification, correctness and interpretation of the results were assessed. The performed review has shown that the checked safety analyses were performed in compliance with the internationally accepted practice, recommendations of UJD and the IAEA. The required level of safety of NPP V-1 Bohunice has been approved. The document with the results and all the findings of the performed review has been prepared. It includes the details of the performed independent calculations, their results and comparison with the results given in the safety report. A special attention was paid to a review of probabilistic safety assessment study of level 1 for NPP Bohunice V-1 after its gradual reconstruction. The probabilistic safety analysis of NPP in full power operation was elaborated in the study and the impact of the gradual reconstruction to the risk decreasing was quantified. The

  15. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services

    Directory of Open Access Journals (Sweden)

    Willighagen Egon L

    2009-09-01

    Full Text Available Abstract Background Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP and REpresentational State Transfer (REST services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. Results We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP, consisting of an extension (IO Data to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. Conclusion XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1 services are discoverable without the need of an external registry, 2 asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3 input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  16. Towards a bioinformatics of patterning: a computational approach to understanding regulative morphogenesis

    Directory of Open Access Journals (Sweden)

    Daniel Lobo

    2012-11-01

    The mechanisms underlying the regenerative abilities of certain model species are of central importance to the basic understanding of pattern formation. Complex organisms such as planaria and salamanders exhibit an exceptional capacity to regenerate complete body regions and organs from amputated pieces. However, despite the outstanding bottom-up efforts of molecular biologists and bioinformatics focused at the level of gene sequence, no comprehensive mechanistic model exists that can account for more than one or two aspects of regeneration. The development of computational approaches that help scientists identify constructive models of pattern regulation is held back by the lack of both flexible morphological representations and a repository for the experimental procedures and their results (altered pattern formation. No formal representation or computational tools exist to efficiently store, search, or mine the available knowledge from regenerative experiments, inhibiting fundamental insights from this huge dataset. To overcome these problems, we present here a new class of ontology to encode formally and unambiguously a very wide range of possible morphologies, manipulations, and experiments. This formalism will pave the way for top-down approaches for the discovery of comprehensive models of regeneration. We chose the planarian regeneration dataset to illustrate a proof-of-principle of this novel bioinformatics of shape; we developed a software tool to facilitate the formalization and mining of the planarian experimental knowledge, and cured a database containing all of the experiments from the principal publications on planarian regeneration. These resources are freely available for the regeneration community and will readily assist researchers in identifying specific functional data in planarian experiments. More importantly, these applications illustrate the presented framework for formalizing knowledge about functional perturbations of morphogenesis, which

  17. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    Science.gov (United States)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratio of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase which catalyzes the release of methane in the final step of methano-genesis, a prime energy metabolism candidate for life exploration space mission in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. Simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure relatively in comparison to the horizontal transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  18. Small envelope protein E of SARS:cloning,expression, purification, CD determination, and bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    SHENXu; XUEJian-Hua; YUChang-Ying; LUOHai-Bin; QINLei; YUXiao-Jing; CHENJing; CHENLi-Li; XIONGBin; YUELi-Duo; CAIJian-Hua; SHENJian-Hua; LUOXiao-Min; CHENKai-Xian; SHITie-Liu; LIYi-Xue; HUGeng-Xi; JIANGHua-Liang

    2003-01-01

    AIM:To obtain the pure sample of SARS small envelope E protein (SARS E protein), study its properties and analyze its possible functions. METHODS: The plasmid of SARS E protein was constructed by the polymerase chain reaction (PCR), and the protein was expressed in the E coli strain. The secondary structure feature of the protein was determined by circular dichroism (CD) technique. The possible functions of this protein were annotated by bioinformatics methods, and its possible three-dimensional model was constructed by molecular modeling. RESULTS: The pure sample of SARS E protein was obtained. The secondary structure feature derived from CD determination is similar to that from the secondary structure prediction. Bioinformatics analysis indicated that the key residues of SARS E protein were much conserved compared to the E proteins of other coronaviruses. In particular, the primary amino acid sequence of SARS E protien is much more similar to that of murine hepatitis virus(MHV) and other mammal coronaviruses. The transmembrane (TM) segment of the SARS E protein is relatively more conserved in the whole protein than other regions. CONCLUSION: The success of expressing the SARS E protein is a good starting point for investigating the structure and functions of this protein and SARS coronavirus itself as well. The SARS E protein may fold in water solution in a similar way as it in membrane-water mixed environment. It is possible that β-sheet I of the SARS E protein interacts with the membrane surface via hydrogen bonding, this β-sheet may uncoil to a random structure in water solution.

  19. Emerging role of bioinformatics tools and software in evolution of clinical research

    Science.gov (United States)

    Gill, Supreet Kaur; Christopher, Ajay Francis; Gupta, Vikas; Bansal, Parveen

    2016-01-01

    Clinical research is making toiling efforts for promotion and wellbeing of the health status of the people. There is a rapid increase in number and severity of diseases like cancer, hepatitis, HIV etc, resulting in high morbidity and mortality. Clinical research involves drug discovery and development whereas clinical trials are performed to establish safety and efficacy of drugs. Drug discovery is a long process starting with the target identification, validation and lead optimization. This is followed by the preclinical trials, intensive clinical trials and eventually post marketing vigilance for drug safety. Softwares and the bioinformatics tools play a great role not only in the drug discovery but also in drug development. It involves the use of informatics in the development of new knowledge pertaining to health and disease, data management during clinical trials and to use clinical data for secondary research. In addition, new technology likes molecular docking, molecular dynamics simulation, proteomics and quantitative structure activity relationship in clinical research results in faster and easier drug discovery process. During the preclinical trials, the software is used for randomization to remove bias and to plan study design. In clinical trials software like electronic data capture, Remote data capture and electronic case report form (eCRF) is used to store the data. eClinical, Oracle clinical are software used for clinical data management and for statistical analysis of the data. After the drug is marketed the safety of a drug could be monitored by drug safety software like Oracle Argus or ARISg. Therefore, softwares are used from the very early stages of drug designing, to drug development, clinical trials and during pharmacovigilance. This review describes different aspects related to application of computers and bioinformatics in drug designing, discovery and development, formulation designing and clinical research.

  20. Emerging role of bioinformatics tools and software in evolution of clinical research

    Directory of Open Access Journals (Sweden)

    Supreet Kaur Gill

    2016-01-01

    Full Text Available Clinical research is making toiling efforts for promotion and wellbeing of the health status of the people. There is a rapid increase in number and severity of diseases like cancer, hepatitis, HIV etc, resulting in high morbidity and mortality. Clinical research involves drug discovery and development whereas clinical trials are performed to establish safety and efficacy of drugs. Drug discovery is a long process starting with the target identification, validation and lead optimization. This is followed by the preclinical trials, intensive clinical trials and eventually post marketing vigilance for drug safety. Softwares and the bioinformatics tools play a great role not only in the drug discovery but also in drug development. It involves the use of informatics in the development of new knowledge pertaining to health and disease, data management during clinical trials and to use clinical data for secondary research. In addition, new technology likes molecular docking, molecular dynamics simulation, proteomics and quantitative structure activity relationship in clinical research results in faster and easier drug discovery process. During the preclinical trials, the software is used for randomization to remove bias and to plan study design. In clinical trials software like electronic data capture, Remote data capture and electronic case report form (eCRF is used to store the data. eClinical, Oracle clinical are software used for clinical data management and for statistical analysis of the data. After the drug is marketed the safety of a drug could be monitored by drug safety software like Oracle Argus or ARISg. Therefore, softwares are used from the very early stages of drug designing, to drug development, clinical trials and during pharmacovigilance. This review describes different aspects related to application of computers and bioinformatics in drug designing, discovery and development, formulation designing and clinical research.