WorldWideScience

Sample records for clade-specific bioinformatics resource

  1. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  2. Influenza research database: an integrated bioinformatics resource for influenza virus research

    Science.gov (United States)

    The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...

  3. The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases

    OpenAIRE

    Bultet, Lisandra Aguilar; Aguilar Rodriguez, Jose; Ahrens, Christian H; Ahrne, Erik Lennart; Ai, Ni; Aimo, Lucila; Akalin, Altuna; Aleksiev, Tyanko; Alocci, Davide; Altenhoff, Adrian; Alves, Isabel; Ambrosini, Giovanna; Pedone, Pascale Anderle; Angelina, Paolo; Anisimova, Maria

    2016-01-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB'...

  4. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    Science.gov (United States)

    Bokulich, Nicholas A; Rideout, Jai Ram; Mercurio, William G; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F; Dutton, Rachel J; Turnbaugh, Peter J; Knight, Rob; Caporaso, J Gregory

    2016-01-01

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

  5. Are Clade Specific HIV Vaccines a Necessity? An Analysis Based on Mathematical Models

    Directory of Open Access Journals (Sweden)

    Dobromir Dimitrov

    2015-12-01

    As HIV-1 envelope immune responses are critical to vaccine related protection, most candidate HIV vaccines entering efficacy trials are based upon a clade specific design. This need for clade specific vaccine prototypes markedly reduces the implementation of potentially effective HIV vaccines. We utilized a mathematical model to determine the effectiveness of immediate roll-out of a non-clade matched vaccine with reduced efficacy compared to constructing clade specific vaccines, which would take considerable time to manufacture and test in safety and efficacy trials. We simulated the HIV epidemic in San Francisco (SF) and South Africa (SA) and projected effectiveness of three vaccination strategies: (i) immediate intervention with a 20–40% vaccine efficacy (VE) non-matched vaccine, (ii) delayed intervention by developing a 50% VE clade-specific vaccine, and (iii) immediate intervention with a non-matched vaccine replaced by a clade-specific vaccine when developed. Immediate vaccination with a non-clade matched vaccine, even with reduced efficacy, would prevent thousands of new infections in SF and millions in SA over 30 years. Vaccination with 50% VE delayed for five years needs six and 12 years in SA to break even with immediate 20 and 30% VE vaccination, respectively, and is not able to surpass the impact of immediate 40% VE vaccination over 30 years. Replacing a 30% VE with a 50% VE vaccine after 5 years reduces HIV acquisition by 5% compared to delayed vaccination. The immediate use of an HIV vaccine with reduced VE in high risk communities appears desirable over a short time line, but higher VE should be pursued to achieve strong long-term impact. Our analysis illustrates the importance of developing surrogate markers (correlates of protection) to allow bridging types of immunogenicity studies to support more rapid assessment of clade specific vaccines.

  6. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  7. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Cloud computing has started to change the way bioinformatics research is carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables applications to output results in real time. Its main purpose is to guarantee application performance while improving resource utilization. Real sequence searching data from BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees application performance while saving up to 20.14% of resources.

  8. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    DEFF Research Database (Denmark)

    Babbitt, Patricia C.; Bagos, Pantelis G.; Bairoch, Amos

    2015-01-01

    During 11–12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication...

  9. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat.

    Science.gov (United States)

    Babbitt, Patricia C; Bagos, Pantelis G; Bairoch, Amos; Bateman, Alex; Chatonnet, Arnaud; Chen, Mark Jinan; Craik, David J; Finn, Robert D; Gloriam, David; Haft, Daniel H; Henrissat, Bernard; Holliday, Gemma L; Isberg, Vignir; Kaas, Quentin; Landsman, David; Lenfant, Nicolas; Manning, Gerard; Nagano, Nozomi; Srinivasan, Narayanaswamy; O'Donovan, Claire; Pruitt, Kim D; Sowdhamini, Ramanathan; Rawlings, Neil D; Saier, Milton H; Sharman, Joanna L; Spedding, Michael; Tsirigos, Konstantinos D; Vastermark, Ake; Vriend, Gerrit

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.

  10. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  11. Creating a specialist protein resource network: a meeting report for the protein bioinformatics and community resources retreat

    NARCIS (Netherlands)

    Babbitt, P.C.; Bagos, P.G.; Bairoch, A.; Bateman, A.; Chatonnet, A.; Chen, M.J.; Craik, D.J.; Finn, R.D.; Gloriam, D.; Haft, D.H.; Henrissat, B.; Holliday, G.L.; Isberg, V.; Kaas, Q.; Landsman, D.; Lenfant, N.; Manning, G.; Nagano, N.; Srinivasan, N.; O'Donovan, C.; Pruitt, K.D.; Sowdhamini, R.; Rawlings, N.D.; Saier, M.H., Jr.; Sharman, J.L.; Spedding, M.; Tsirigos, K.D.; Vastermark, A.; Vriend, G.

    2015-01-01

    During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from

  12. Tools and data services registry: a community effort to document bioinformatics resources

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  13. CLIMB (the Cloud Infrastructure for Microbial Bioinformatics): an online resource for the medical microbiology community.

    Science.gov (United States)

    Connor, Thomas R; Loman, Nicholas J; Thompson, Simon; Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J; Richardson, Emily; Ismail, Matthew; Elwood-Thompson, Simon; Kitchen, Christine; Guest, Martyn; Bakke, Marius; Sheppard, Samuel K; Pallen, Mark J

    2016-09-01

    The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.

  14. Genotyping of Brucella species using clade specific SNPs

    Directory of Open Access Journals (Sweden)

    Foster Jeffrey T

    2012-06-01

    Background: Brucellosis is a worldwide disease of mammals caused by Alphaproteobacteria in the genus Brucella. The genus is genetically monomorphic, requiring extensive genotyping to differentiate isolates. We utilized two different genotyping strategies to characterize isolates. First, we developed a microarray-based assay based on 1000 single nucleotide polymorphisms (SNPs) that were identified from whole genome comparisons of two B. abortus isolates, one B. melitensis, and one B. suis. We then genotyped a diverse collection of 85 Brucella strains at these SNP loci and generated a phylogenetic tree of relationships. Second, we developed a selective primer-extension assay system using capillary electrophoresis that targeted 17 high value SNPs across 8 major branches of the phylogeny and determined their genotypes in a large collection (n = 340) of diverse isolates. Results: Our 1000 SNP microarray readily distinguished B. abortus, B. melitensis, and B. suis, differentiating B. melitensis and B. suis into two clades each. Brucella abortus was divided into four major clades. Our capillary-based SNP genotyping confirmed all major branches from the microarray assay and assigned all samples to defined lineages. Isolates from these lineages and closely related isolates, among the most commonly encountered lineages worldwide, can now be quickly and easily identified and genetically characterized. Conclusions: We have identified clade-specific SNPs in Brucella that can be used for rapid assignment into major groups below the species level in the three main Brucella species. Our assays represent SNP genotyping approaches that can reliably determine the evolutionary relationships of bacterial isolates without the need for whole genome sequencing of all isolates.
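
    The idea of assigning an isolate to a group from a handful of clade-specific SNPs can be illustrated with a minimal Python sketch. This is not the assay or software described in the record above: the marker table, SNP positions, alleles and the function name `assign_clade` are all hypothetical, invented only to show the lookup logic.

```python
from typing import Dict, Optional

# Hypothetical marker table: clade -> {SNP position: clade-specific allele}.
# Positions and alleles are invented for illustration, not taken from the study.
CLADE_MARKERS: Dict[str, Dict[int, str]] = {
    "B. abortus": {1021: "T", 5530: "G"},
    "B. melitensis": {2210: "A", 7345: "C"},
    "B. suis": {3127: "G", 9902: "T"},
}


def assign_clade(genotype: Dict[int, str]) -> Optional[str]:
    """Return the single clade whose marker SNPs all match, else None."""
    matches = [
        clade
        for clade, markers in CLADE_MARKERS.items()
        # A clade matches only if every one of its marker positions
        # was genotyped and carries the clade-specific allele.
        if all(genotype.get(pos) == allele for pos, allele in markers.items())
    ]
    # Ambiguous (several clades) or unmatched isolates are left unassigned.
    return matches[0] if len(matches) == 1 else None
```

    Requiring every marker to match and returning None on ambiguity is a deliberately conservative choice for the sketch; a real pipeline would also need to handle missing calls and genotyping errors.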

  15. Genetic recombination events between sympatric Clade A and Clade C lice in Africa.

    Science.gov (United States)

    Veracx, Aurélie; Boutellis, Amina; Raoult, Didier

    2013-09-01

    Human head and body lice have been classified into three phylogenetic clades (Clades A, B, and C) based on mitochondrial DNA. Based on nuclear markers (the 18S rRNA gene and the PM2 spacer), two genotypes of Clade A head and body lice, including one that is specifically African (Clade A2), have been described. In this study, we sequenced the PM2 spacer of Clade C head lice from Ethiopia and compared these sequences with sequences from previous works. Trees were drawn, and an analysis of genetic diversity based on the cytochrome b gene and the PM2 spacer was performed for African and non-African lice. In the tree drawn based on the PM2 spacer, the African and non-African lice formed separate clusters. However, Clade C lice from Ethiopia were placed within the African Clade A subcluster (Clade A2). This result suggests that recombination events have occurred between Clade A2 lice and Clade C lice, reflecting the sympatric nature of African lice. Finally, the PM2 spacer and cytochrome b gene sequences of human lice revealed a higher level of genetic diversity in Africa than in other regions.

  16. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    NARCIS (Netherlands)

    L. Shen (Lishuang); M.A. Diroma (Maria Angela); M. Gonzalez (Michael); D. Navarro-Gomez (Daniel); J. Leipzig (Jeremy); M.T. Lott (Marie T.); M. van Oven (Mannis); D.C. Wallace; C.C. Muraresku (Colleen Clarke); Z. Zolkipli-Cunningham (Zarazuela); P.F. Chinnery (Patrick); M. Attimonelli (Marcella); S. Zuchner (Stephan); M.J. Falk (Marni J.); X. Gai (Xiaowu)

    2016-01-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes,

  17. PATRIC, the bacterial bioinformatics database and analysis resource.

    Science.gov (United States)

    Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.

  18. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  19. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  20. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  1. Signature proteins for the major clades of Cyanobacteria

    Directory of Open Access Journals (Sweden)

    Mathews Divya W

    2010-01-01

    Background: The phylogeny and taxonomy of cyanobacteria is currently poorly understood due to paucity of reliable markers for identification and circumscription of its major clades. Results: A combination of phylogenomic and protein signature based approaches was used to characterize the major clades of cyanobacteria. Phylogenetic trees were constructed for 44 cyanobacteria based on 44 conserved proteins. In parallel, Blastp searches were carried out on each ORF in the genomes of Synechococcus WH8102, Synechocystis PCC6803, Nostoc PCC7120, Synechococcus JA-3-3Ab, Prochlorococcus MIT9215 and Prochlor. marinus subsp. marinus CCMP1375 to identify proteins that are specific for various main clades of cyanobacteria. These studies have identified 39 proteins that are specific for all (or most) cyanobacteria and large numbers of proteins for other cyanobacterial clades. The identified signature proteins include: (i) 14 proteins for a deep branching clade (Clade A) of Gloeobacter violaceus and two diazotrophic Synechococcus strains (JA-3-3Ab and JA2-3-B'a); (ii) 5 proteins that are present in all other cyanobacteria except those from Clade A; (iii) 60 proteins that are specific for a clade (Clade C) consisting of various marine unicellular cyanobacteria (viz. Synechococcus and Prochlorococcus); (iv) 14 and 19 signature proteins that are specific for the Clade C Synechococcus and Prochlorococcus strains, respectively; (v) 67 proteins that are specific for the Low B/A ecotype Prochlorococcus strains, containing lower ratio of chl b/a2 and adapted to growth at high light intensities; (vi) 65 and 8 proteins that are specific for the Nostocales and Chroococcales orders, respectively; and (vii) 22 and 9 proteins that are uniquely shared by various Nostocales and Oscillatoriales orders, or by these two orders and the Chroococcales, respectively. We also describe 3 conserved indels in flavoprotein, heme oxygenase and protochlorophyllide oxidoreductase proteins that
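
    As a rough illustration of what "specific for a clade" means in a presence/absence sense, the following Python sketch selects protein families found in every genome of a clade and in no genome outside it. It is an assumption-laden simplification, not the Blastp-based procedure of the record above: the function name `signature_proteins`, the input layout and the strict all-inside/none-outside rule are hypothetical (the abstract itself allows proteins present in "all (or most)" members).

```python
from typing import Dict, Set


def signature_proteins(presence: Dict[str, Set[str]], clade: Set[str]) -> Set[str]:
    """Proteins found in every clade genome and in no genome outside the clade.

    presence maps a genome name to the set of protein families detected in it
    (hypothetical input layout); clade is the set of genome names in the clade.
    """
    inside = [proteins for genome, proteins in presence.items() if genome in clade]
    outside = [proteins for genome, proteins in presence.items() if genome not in clade]
    if not inside:
        return set()
    shared = set.intersection(*inside)    # present in all clade members
    seen_outside = set().union(*outside)  # present in any non-member
    return shared - seen_outside
```

    Relaxing the rule to "most" clade members would mean replacing the intersection with a per-protein count and a threshold, which is left out here to keep the sketch short.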

  2. PATRIC, the bacterial bioinformatics database and analysis resource

    Science.gov (United States)

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  3. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  4. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  5. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  6. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Background: The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results: This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA) of the Object Management Group (OMG), we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion: MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  7. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR-based tests for pathogenic E. coli strains. A variety of computer tools are used, including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG) database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, and thus are appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  8. FungiDB: An Integrated Bioinformatic Resource for Fungi and Oomycetes

    Directory of Open Access Journals (Sweden)

    Evelina Y. Basenko

    2018-03-01

    FungiDB (fungidb.org) is a free online resource for data mining and functional genomics analysis for fungal and oomycete species. FungiDB is part of the Eukaryotic Pathogen Genomics Database Resource (EuPathDB, eupathdb.org) platform that integrates genomic, transcriptomic, proteomic, and phenotypic datasets, and other types of data for pathogenic and nonpathogenic, free-living and parasitic organisms. FungiDB is one of the largest EuPathDB databases, containing nearly 100 genomes obtained from GenBank, Aspergillus Genome Database (AspGD), The Broad Institute, Joint Genome Institute (JGI), Ensembl, and other sources. FungiDB offers a user-friendly web interface with embedded bioinformatics tools that support custom in silico experiments that leverage FungiDB-integrated data. In addition, a Galaxy-based workspace enables users to generate custom pipelines for large-scale data analysis (e.g., RNA-Seq, variant calling, etc.). This review provides an introduction to the FungiDB resources and focuses on available features, tools, and queries and how they can be used to mine data across a diverse range of integrated FungiDB datasets and records.

  9. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    OpenAIRE

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; Oven, Mannis; Wallace, D.C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J.; Gai, Xiaowu

    2016-01-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR ...

  10. E-MSD: an integrated data resource for bioinformatics.

    Science.gov (United States)

    Velankar, S; McNeil, P; Mittard-Runte, V; Suarez, A; Barrell, D; Apweiler, R; Henrick, K

    2005-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to the improved integration of structural databases such as MSD and sequence databases like UniProt is the absence of up to date and well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of the sequence family databases such as Pfam and Interpro with the structure-oriented databases of SCOP and CATH. This information has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, Interpro, SCOP and CATH). This exchange of annotation information has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. This work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.

  11. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  12. miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal

    Science.gov (United States)

    Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily

    2018-01-01

    Hundreds of bioinformatics tools have been developed for microRNA (miRNA) investigations, including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and rankable. The ranking feature is vital to quickly identify and prioritize the more useful from the obscure tools. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355
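    The adjustable ranking described in this abstract can be illustrated with a small sketch. It is not the portal's implementation: the `Tool` fields, the min-max normalisation and the weighted sum are all assumptions chosen for illustration, the demo records are invented, and the PageRank criterion mentioned in the abstract is left out.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    # Hypothetical record layout; the real portal's schema is not given in the abstract.
    name: str
    citations: int
    votes: float
    year: int


def rank_tools(tools, weights):
    """Return tool names ordered by a weighted sum of min-max normalised criteria.

    `weights` maps a Tool attribute name to its weight, so changing the
    mapping changes the ranking criterion.
    """
    if not tools:
        return []

    def normalise(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            # All tools tie on this criterion, so it contributes nothing.
            return [0.0] * len(values)
        return [(v - lo) / (hi - lo) for v in values]

    columns = {c: normalise([getattr(t, c) for t in tools]) for c in weights}
    scores = [
        sum(weights[c] * columns[c][i] for c in weights)
        for i in range(len(tools))
    ]
    order = sorted(range(len(tools)), key=lambda i: scores[i], reverse=True)
    return [tools[i].name for i in order]


# Invented demo data, for illustration only.
demo_tools = [
    Tool("alpha", citations=100, votes=3.0, year=2010),
    Tool("beta", citations=10, votes=4.5, year=2017),
    Tool("gamma", citations=50, votes=4.0, year=2014),
]

by_citations = rank_tools(demo_tools, {"citations": 1.0})
by_votes_and_recency = rank_tools(demo_tools, {"votes": 1.0, "year": 1.0})
```

    Passing a different `weights` mapping reorders the same records, which is the sense in which a ranking function can be "adjusted using different criteria".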

  13. A Ser29Leu substitution in the cytosine deaminase Fca1p is responsible for clade-specific flucytosine resistance in Candida dubliniensis.

    LENUS (Irish Health Repository)

    McManus, Brenda A

    2009-11-01

    The population structure of the opportunistic yeast pathogen Candida dubliniensis is composed of three main multilocus sequence typing clades (clades C1 to C3), and clade C3 predominantly consists of isolates from the Middle East that exhibit high-level resistance (MIC50 ≥ 128 µg/ml) to the fungicidal agent flucytosine (5FC). The close relative of C. dubliniensis, C. albicans, also exhibits clade-specific resistance to 5FC, and resistance is most commonly mediated by an Arg101Cys substitution in the FUR1 gene encoding uracil phosphoribosyltransferase. Broth microdilution assays with fluorouracil (5FU), the toxic deaminated form of 5FC, showed that both 5FC-resistant and 5FC-susceptible C. dubliniensis isolates exhibited similar 5FU MICs, suggesting that the C. dubliniensis cytosine deaminase (Fca1p) encoded by C. dubliniensis FCA1 (CdFCA1) may play a role in mediating C. dubliniensis clade-specific 5FC resistance. Amino acid sequence analysis of the CdFCA1 open reading frame (ORF) identified a homozygous Ser29Leu substitution in all 12 5FC-resistant isolates investigated which was not present in any of the 9 5FC-susceptible isolates examined. The tetracycline-inducible expression of the CdFCA1 ORF from a 5FC-susceptible C. dubliniensis isolate in two separate 5FC-resistant clade C3 isolates restored susceptibility to 5FC, demonstrating that the Ser29Leu substitution was responsible for the clade-specific 5FC resistance and that the 5FC resistance encoded by FCA1 genes with the Ser29Leu transition is recessive. Quantitative real-time PCR analysis showed no significant difference in CdFCA1 expression between 5FC-susceptible and 5FC-resistant isolates in either the presence or the absence of subinhibitory concentrations of 5FC, suggesting that the Ser29Leu substitution in the CdFCA1 ORF is the sole cause of 5FC resistance in clade C3 C. dubliniensis isolates.

  14. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage, with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research, they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.
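    The two concentration figures quoted in this abstract (the share of usage held by the top 5% of names, and the fraction of names mentioned only once) are simple to compute once mentions have been extracted. The sketch below assumes the mentions are already available as a flat list of resource names; the function names and the toy data are hypothetical and do not come from the study or its dataset.

```python
import math
from collections import Counter


def top_share(mentions, top_fraction=0.05):
    """Share of all mentions that belongs to the most-mentioned names.

    `top_fraction` is the fraction of distinct names to treat as "top";
    at least one name is always included.
    """
    counts = Counter(mentions)
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


def singleton_fraction(mentions):
    """Fraction of distinct names that are mentioned exactly once."""
    counts = Counter(mentions)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy data, invented for illustration: four distinct names, ten mentions.
toy_mentions = ["BLAST"] * 6 + ["R"] * 2 + ["ToolA", "ToolB"]

toy_top_share = top_share(toy_mentions, top_fraction=0.25)
toy_singletons = singleton_fraction(toy_mentions)
```

    On the toy list, the single most-mentioned name holds six of the ten mentions, and two of the four names appear only once.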

  15. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  16. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  17. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  18. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and range in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  19. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI), and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD) strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  20. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  1. Best practices in bioinformatics training for life scientists

    DEFF Research Database (Denmark)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik

    2013-01-01

    ... to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes...

  2. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics ... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  3. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  4. HIV Controllers Exhibit Enhanced Frequencies of Major Histocompatibility Complex Class II Tetramer+ Gag-Specific CD4+ T Cells in Chronic Clade C HIV-1 Infection.

    Science.gov (United States)

    Laher, Faatima; Ranasinghe, Srinika; Porichis, Filippos; Mewalal, Nikoshia; Pretorius, Karyn; Ismail, Nasreen; Buus, Søren; Stryhn, Anette; Carrington, Mary; Walker, Bruce D; Ndung'u, Thumbi; Ndhlovu, Zaza M

    2017-04-01

    Immune control of viral infections is heavily dependent on helper CD4+ T cell function. However, the understanding of the contribution of HIV-specific CD4+ T cell responses to immune protection against HIV-1, particularly in clade C infection, remains incomplete. Recently, major histocompatibility complex (MHC) class II tetramers have emerged as a powerful tool for interrogating antigen-specific CD4+ T cells without relying on effector functions. Here, we defined the MHC class II alleles for immunodominant Gag CD4+ T cell epitopes in clade C virus infection, constructed MHC class II tetramers, and then used these to define the magnitude, function, and relation to the viral load of HIV-specific CD4+ T cell responses in a cohort of untreated HIV clade C-infected persons. We observed significantly higher frequencies of MHC class II tetramer-positive CD4+ T cells in HIV controllers than progressors (P = 0.0001), and these expanded Gag-specific CD4+ T cells in HIV controllers showed higher levels of expression of the cytolytic proteins granzymes A and B. Importantly, targeting of the immunodominant Gag41 peptide in the context of HLA class II DRB1*1101 was associated with HIV control (r = -0.5, P = 0.02). These data identify an association between HIV-specific CD4+ T cell targeting of immunodominant Gag epitopes and immune control, particularly the contribution of a single class II MHC-peptide complex to the immune response against HIV-1 infection. Furthermore, these results highlight the advantage of the use of class II tetramers in evaluating HIV-specific CD4+ T cell responses in natural infections. IMPORTANCE Increasing evidence suggests that virus-specific CD4+ T cells contribute to the immune-mediated control of clade B HIV-1 infection, yet there remains a relative paucity of data regarding the role of HIV-specific CD4+ T cells in shaping adaptive immune responses in individuals infected with clade C, which is responsible for the majority of HIV

  5. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  6. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  7. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    Science.gov (United States)

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. © 2016 WILEY PERIODICALS, INC.

  8. OpenHelix: bioinformatics education outside of a different box.

    Science.gov (United States)

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.

  9. SOLiD sequencing of four Vibrio vulnificus genomes enables comparative genomic analysis and identification of candidate clade-specific virulence genes

    Directory of Open Access Journals (Sweden)

    Telonis-Scott Marina

    2010-09-01

    Background: Vibrio vulnificus is the leading cause of reported death from consumption of seafood in the United States. Despite several decades of research on molecular pathogenesis, much remains to be learned about the mechanisms of virulence of this opportunistic bacterial pathogen. The two complete and annotated genomic DNA sequences of V. vulnificus belong to strains of clade 2, which is the predominant clade among clinical strains. Clade 2 strains generally possess higher virulence potential in animal models of disease compared with clade 1, which predominates among environmental strains. SOLiD sequencing of four V. vulnificus strains representing different clades (1 and 2) and biotypes (1 and 2) was used for comparative genomic analysis. Results: Greater than 4,100,000 bases were sequenced of each strain, yielding approximately 100-fold coverage for each of the four genomes. Although the read lengths of SOLiD genomic sequencing were only 35 nt, we were able to make significant conclusions about the unique and shared sequences among the genomes, including identification of single nucleotide polymorphisms. Comparative analysis of the newly sequenced genomes to the existing reference genomes enabled the identification of 3,459 core V. vulnificus genes shared among all six strains and 80 clade 2-specific genes. We identified 523,161 SNPs among the six genomes. Conclusions: We were able to glean much information about the genomic content of each strain using next generation sequencing. Flp pili, GGDEF proteins, and genomic island XII were identified as possible virulence factors because of their presence in virulent sequenced strains. Genomic comparisons also point toward the involvement of sialic acid catabolism in pathogenesis.

  10. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
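    The non-rigid document model this abstract describes can be illustrated with a small in-memory sketch. It is not CouchDB's API (CouchDB map functions are normally written in JavaScript and run on the server), and the documents, field names and the `build_view` helper are hypothetical, not geneSmash's or drugBase's actual schema. It only shows the idea of a map function emitting key/value rows from documents that need not share the same fields.

```python
# Hypothetical documents: no shared schema is required.
DOCS = [
    {"_id": "g1", "symbol": "TP53", "chromosome": "17"},
    {"_id": "g2", "symbol": "BRCA1", "chromosome": "17", "aliases": ["RNF53"]},
    {"_id": "d1", "drug": "imatinib", "targets": ["ABL1"]},
]


def by_gene_symbol(doc):
    """Map function: emit (key, value) only for documents that look like genes."""
    if "symbol" in doc:
        yield doc["symbol"], doc.get("chromosome")


def build_view(docs, map_fn):
    """Apply map_fn to every document and return rows sorted by (key, document id)."""
    rows = []
    for doc in docs:
        for key, value in map_fn(doc):
            rows.append((key, doc["_id"], value))
    return sorted(rows, key=lambda row: (row[0], row[1]))


if __name__ == "__main__":
    for row in build_view(DOCS, by_gene_symbol):
        print(row)
```

    The drug document is simply skipped by the map function, which is the practical meaning of a non-rigid schema: new document types can be stored without altering existing views.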

  11. Applications and Methods Utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for Bioinformatics Resource Discovery and Disparate Data and Service Integration

    Science.gov (United States)

    Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of scientific data between information resources difficu...

  12. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  13. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer

    2016-01-01

    The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially......, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry....

  14. Bioinformatics Education in Pathology Training: Current Scope and Future Direction

    Directory of Open Access Journals (Sweden)

    Michael R Clay

    2017-04-01

    Training anatomic and clinical pathology residents in the principles of bioinformatics is a challenging endeavor. Most residents receive little to no formal exposure to bioinformatics during medical education, and most of the pathology training is spent interpreting histopathology slides using light microscopy or focused on laboratory regulation, management, and interpretation of discrete laboratory data. At a minimum, residents should be familiar with data structure, data pipelines, data manipulation, and data regulations within clinical laboratories. Fellowship-level training should incorporate advanced principles unique to each subspecialty. Barriers to bioinformatics education include the clinical apprenticeship training model, ill-defined educational milestones, inadequate faculty expertise, and limited exposure during medical training. Online educational resources, case-based learning, and incorporation into molecular genomics education could serve as effective educational strategies. Overall, pathology bioinformatics training can be incorporated into pathology resident curricula, provided there is motivation to incorporate, institutional support, educational resources, and adequate faculty expertise.

  15. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  16. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  17. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Background: Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results: To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions: PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and
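    The batching trade-off mentioned in this abstract can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not PaPy's API: the function names (`batched`, `run_pipeline`, `reverse_complement`, `gc_fraction`) are hypothetical, and a linear chain of stages stands in for the general directed acyclic graph. Each batch is mapped through every stage on a worker pool before the next batch is read, so a larger batch size gives more parallelism while a smaller one keeps fewer items in memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items, reading the input lazily."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_pipeline(items, stages, batch_size=2, workers=2):
    """Push items through a linear chain of stages, one batch at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(items, batch_size):
            for stage in stages:
                # pool.map preserves input order within the batch.
                batch = list(pool.map(stage, batch))
            yield from batch


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string over the alphabet ACGT."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    reads = ["ATGC", "GGCC", "AATT"]
    for value in run_pipeline(reads, [reverse_complement, gc_fraction]):
        print(value)
```

    Because `run_pipeline` is a generator, nothing is computed until results are consumed, and at most one batch is held in memory at a time.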

  18. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.

  19. Development of a cloud-based Bioinformatics Training Platform.

    Science.gov (United States)

    Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A

    2017-05-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.

  20. Dynamic of H5N1 virus in Cambodia and emergence of a novel endemic sub-clade.

    Science.gov (United States)

    Sorn, San; Sok, Touch; Ly, Sovann; Rith, Sareth; Tung, Nguyen; Viari, Alain; Gavotte, Laurent; Holl, Davun; Seng, Heng; Asgari, Nima; Richner, Beat; Laurent, Denis; Chea, Nora; Duong, Veasna; Toyoda, Tetsuya; Yasuda, Chadwick Y; Kitsutani, Paul; Zhou, Paul; Bing, Sun; Deubel, Vincent; Donis, Ruben; Frutos, Roger; Buchy, Philippe

    2013-04-01

    In Cambodia, the first detection of HPAI H5N1 virus in birds occurred in January 2004 and since then there have been 33 outbreaks in poultry while 21 human cases were reported. The origin and dynamics of these epizootics in Cambodia remain unclear. In this work we used a range of bioinformatics methods to analyze the Cambodian virus sequences together with those from neighboring countries. Six HA lineages belonging to clades 1 and 1.1 have been identified since 2004. Lineage 1 shares an ancestor with viruses from Thailand and disappeared after 2005, to be replaced by lineage 2 originating from Vietnam and then by lineage 3. The highly adapted lineage 4 was seen only in Cambodia. Lineage 5 has been circulating in both Vietnam and Cambodia since 2008 and was probably introduced into Cambodia through unregistered transboundary poultry trade. Lineage 6 has been endemic to Cambodia since 2010 and could be classified as a new clade according to WHO/OIE/FAO criteria for H5N1 virus nomenclature. We propose to name it clade 1.1A. There is a direct filiation of lineages 2 to 6 with a temporal evolution and geographic differentiation for lineages 4 and 6. By the end of 2011, two lineages, i.e. lineages 5 and 6, with different transmission paths were cocirculating in Cambodia. The presence of lineage 6 only in Cambodia suggests the existence of a transmission specific to this country whereas the presence of lineage 5 in both Cambodia and Vietnam indicates a distinct way of circulation of infected poultry. Copyright © 2012 Elsevier B.V. All rights reserved.

  1. REDIdb: an upgraded bioinformatics resource for organellar RNA editing sites.

    Science.gov (United States)

    Picardi, Ernesto; Regina, Teresa M R; Verbitskiy, Daniil; Brennicke, Axel; Quagliariello, Carla

    2011-03-01

    RNA editing is a post-transcriptional molecular process whereby the information in a genetic message is modified from that in the corresponding DNA template by means of nucleotide substitutions, insertions and/or deletions. It occurs mostly in organelles by clade-specific diverse and unrelated biochemical mechanisms. RNA editing events have been annotated in primary databases such as GenBank and at a more sophisticated level in the specialized databases REDIdb, dbRES and EdRNA. At present, REDIdb is the only freely available database that focuses on the organellar RNA editing process and annotates each editing modification in its biological context. Here we present an updated and upgraded release of REDIdb with a web-interface refurbished with graphical and computational facilities that improve RNA editing investigations. Details of the REDIdb features and novelties are illustrated and compared to other RNA editing databases. REDIdb can be queried freely at http://biologia.unical.it/py_script/REDIdb/. Copyright © 2010 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

  2. Differential clade-specific HLA-B*3501 association with HIV-1 disease outcome is linked to immunogenicity of a single Gag epitope

    DEFF Research Database (Denmark)

    Matthews, Philippa C; Koyanagi, Madoka; Kløverpris, Henrik N

    2012-01-01

    -clade sequences, which critically reduces recognition of the Gag NY10 epitope. These data suggest that in spite of any inherent HLA-linked T-cell receptor repertoire differences that may exist, maximizing the breadth of the Gag-specific CD8(+) T-cell response, by the addition of even a single epitope, may......The strongest genetic influence on immune control in HIV-1 infection is the HLA class I genotype. Rapid disease progression in B-clade infection has been linked to HLA-B*35 expression, in particular to the less common HLA-B*3502 and HLA-B*3503 subtypes but also to the most prevalent subtype, HLA...

  3. A middleware-based platform for the integration of bioinformatic services

    Directory of Open Access Journals (Sweden)

    Guzmán Llambías

    2015-08-01

    Performing bioinformatics experiments involves intensive access to distributed services and information resources over the Internet. Although existing tools facilitate the implementation of workflow-oriented applications, they lack the capabilities to integrate services beyond small-scale applications, particularly services with heterogeneous interaction patterns and at a larger scale. This is particularly required to enable large-scale distributed processing of biological data generated by massive sequencing technologies. On the other hand, such integration mechanisms are provided by middleware products like Enterprise Service Buses (ESB), which make it possible to integrate distributed systems following a Service Oriented Architecture. This paper proposes an integration platform, based on enterprise middleware, to integrate Bioinformatics services. It presents a multi-level reference architecture and focuses on ESB-based mechanisms to provide asynchronous communications, event-based interactions and data transformation capabilities. The paper presents a formal specification of the platform using the Event-B model.

  4. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-01-01

    concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource

  5. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  6. Bioinformatics-Aided Venomics

    Directory of Open Access Journals (Sweden)

    Quentin Kaas

    2015-06-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future.

  7. GeneDig: a web application for accessing genomic and bioinformatics knowledge.

    Science.gov (United States)

    Suciu, Radu M; Aydin, Emir; Chen, Brian E

    2015-02-28

    With the exponential increase and widespread availability of genomic, transcriptomic, and proteomic data, accessing these '-omics' data is becoming increasingly difficult. The current resources for accessing and analyzing these data have been created to perform highly specific functions intended for specialists, and thus typically emphasize functionality over user experience. We have developed a web-based application, GeneDig.org, that allows any general user access to genomic information with ease and efficiency. GeneDig allows for searching and browsing genes and genomes, while a dynamic navigator displays genomic, RNA, and protein information simultaneously for co-navigation. We demonstrate that our application provides access to genomic information that is more than five times faster and more efficient than any currently available method. We have developed GeneDig as a platform for bioinformatics integration with usability as its central design focus. This platform will introduce genomic navigation to broader audiences while aiding the bioinformatics analyses performed in everyday biology research.

  8. MOWServ: a web client for integration of bioinformatic resources

    Science.gov (United States)

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  9. A revision of the Solanum elaeagnifolium clade (Elaeagnifolium clade; subgenus Leptostemonum, Solanaceae)

    Directory of Open Access Journals (Sweden)

    Sandra Knapp

    2017-08-01

    The Solanum elaeagnifolium clade (Elaeagnifolium clade) contains five species of small, often rhizomatous, shrubs from deserts and dry forests in North and South America. Members of the clade were previously classified in sections Leprophora, Nycterium and Lathyrocarpum, and were not thought to be closely related. The group is sister to the species-rich monophyletic Old World clade of spiny solanums. The species of the group have an amphitropical distribution, with three species in Mexico and the southwestern United States and three species in Argentina. Solanum elaeagnifolium occurs in both North and South America, and is a noxious invasive weed in dry areas worldwide. Members of the group are highly variable morphologically, and this variability has led to much synonymy, particularly in the widespread S. elaeagnifolium. We here review the taxonomic history, morphology, relationships and ecology of these species and provide keys for their identification, descriptions, full synonymy (including designations of lectotypes) and nomenclatural notes. Illustrations, distribution maps and preliminary conservation assessments are provided for all species.

  10. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.

    Science.gov (United States)

    Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E

    2012-03-19

    A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly

  11. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community

    Science.gov (United States)

    2012-01-01

    Background A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the

  12. Establishing a distributed national research infrastructure providing bioinformatics support to life science researchers in Australia.

    Science.gov (United States)

    Schneider, Maria Victoria; Griffin, Philippa C; Tyagi, Sonika; Flannery, Madison; Dayalan, Saravanan; Gladman, Simon; Watson-Haigh, Nathan; Bayer, Philipp E; Charleston, Michael; Cooke, Ira; Cook, Rob; Edwards, Richard J; Edwards, David; Gorse, Dominique; McConville, Malcolm; Powell, David; Wilkins, Marc R; Lonie, Andrew

    2017-06-30

    EMBL Australia Bioinformatics Resource (EMBL-ABR) is a developing national research infrastructure, providing bioinformatics resources and support to life science and biomedical researchers in Australia. EMBL-ABR comprises 10 geographically distributed national nodes with one coordinating hub, with current funding provided through Bioplatforms Australia and the University of Melbourne for its initial 2-year development phase. The EMBL-ABR mission is to: (1) increase Australia's capacity in bioinformatics and data sciences; (2) contribute to the development of training in bioinformatics skills; (3) showcase Australian data sets at an international level and (4) enable engagement in international programs. The activities of EMBL-ABR are focussed in six key areas, aligning with comparable international initiatives such as ELIXIR, CyVerse and NIH Commons. These key areas (Tools, Data, Standards, Platforms, Compute and Training) are described in this article. © The Author 2017. Published by Oxford University Press.

  13. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Science.gov (United States)

    Fristensky, Brian

    2007-01-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere. PMID:17291351

  14. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  15. Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package.

    Science.gov (United States)

    El-Kalioby, Mohamed; Abouelhoda, Mohamed; Krüger, Jan; Giegerich, Robert; Sczyrba, Alexander; Wall, Dennis P; Tonellato, Peter

    2012-01-01

    Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org.

  16. Overexpressed Proteins in Hypervirulent Clade 8 and Clade 6 Strains of Escherichia coli O157:H7 Compared to E. coli O157:H7 EDL933 Clade 3 Strain.

    Directory of Open Access Journals (Sweden)

    Natalia Amigo

    Escherichia coli O157:H7 is responsible for severe diarrhea and hemolytic uremic syndrome (HUS), and predominantly affects children under 5 years. The major virulence traits are Shiga toxins, necessary to develop HUS, and the Type III Secretion System (T3SS), through which bacteria translocate effector proteins directly into the host cell. By SNP typing, E. coli O157:H7 was separated into nine different clades. Clade 8 and clade 6 strains were more frequently associated with severe disease and HUS. In this study, we aimed to identify differentially expressed proteins in two strains of E. coli O157:H7 (clade 8 and clade 6), obtained from cattle, and compared them with the well characterized reference EDL933 strain (clade 3). Clade 8 and clade 6 strains show enhanced pathogenicity in a mouse model and virulence-related properties. Proteins were extracted and analyzed using the TMT-6plex labeling strategy associated with two-dimensional liquid chromatography and tandem mass spectrometry. We detected 2241 proteins in the cell extract and 1787 proteins in the culture supernatants. Attention was focused on the proteins related to virulence, overexpressed in clade 6 and 8 strains compared to EDL933 strain. The relevant proteins overexpressed in the clade 8 strain were the curli protein CsgC, a transcriptional activator (PchE), phage proteins, Stx2, FlgM and FlgD, a dienelactone hydrolase, CheW and CheY, and the SPATE protease EspP. For the clade 6 strain, a high overexpression of phage proteins was detected, mostly from Stx2 encoding phage, including Stx2, flagellin and the protease TagA, EDL933_p0016, dienelactone hydrolase, and Haemolysin A, amongst others with unknown function. Some of these proteins were analyzed by RT-qPCR to corroborate the proteomic data. Clade 6 and clade 8 strains showed enhanced transcription of 10 out of 12 genes compared to EDL933. These results may provide new insights into E. coli O157:H7 mechanisms of pathogenesis.

  17. Bioinformatics analysis and construction of phylogenetic tree of aquaporins from Echinococcus granulosus.

    Science.gov (United States)

    Wang, Fen; Ye, Bin

    2016-09-01

    Cystic echinococcosis, caused by the metacestode larvae of Echinococcus granulosus (Eg), is a chronic, worldwide, and severe zoonotic parasitosis. The treatment of cystic echinococcosis is still difficult since surgery cannot fit the needs of all patients, and drugs can lead to serious adverse events as well as resistance. Screening for target proteins that interact with new anti-hydatidosis drugs is urgently needed to meet the prevailing challenges. Here, we analyzed the sequences and structural properties of E. granulosus aquaporins (EgAQPs), and constructed a phylogenetic tree by bioinformatics methods. The MIP family signature and Protein kinase C phosphorylation sites were predicted in all nine EgAQPs. α-helix and random coil were the main secondary structures of EgAQPs. The numbers of transmembrane regions were three to six, which indicated that EgAQPs contained multiple hydrophobic regions. A neighbor-joining tree indicated that EgAQPs were divided into two branches: seven EgAQPs formed a clade with human AQP1, a "strict" aquaporin, and the other two EgAQPs formed a clade with human AQP9, an aquaglyceroporin. Unfortunately, homology modeling of EgAQPs was aborted. These results provide a foundation for understanding and researching the biological function of E. granulosus.

  18. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology and efficient genome mining, has been emerging as the long-awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives. The absence of food-specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery. In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rate.

  19. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  20. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  1. Bioinformatics Methods for Interpreting Toxicogenomics Data: The Role of Text-Mining

    NARCIS (Netherlands)

    Hettne, K.M.; Kleinjans, J.; Stierum, R.H.; Boorsma, A.; Kors, J.A.

    2014-01-01

    This chapter concerns the application of bioinformatics methods to the analysis of toxicogenomics data. The chapter starts with an introduction covering how bioinformatics has been applied in toxicogenomics data analysis, and continues with a description of the foundations of a specific

  2. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  3. Bioinformatics on the Cloud Computing Platform Azure

    Science.gov (United States)

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  4. Bioinformatics in the Netherlands: the value of a nationwide community.

    Science.gov (United States)

    van Gelder, Celia W G; Hooft, Rob W W; van Rijswijk, Merlijn N; van den Berg, Linda; Kok, Ruben G; Reinders, Marcel; Mons, Barend; Heringa, Jaap

    2017-09-15

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures supporting a relatively large Dutch bioinformatics community will be reviewed. We will show that the most valuable resource that we have built over these years is the close-knit national expert community that is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing the ever-changing landscape of data challenges and working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times where existing funding models are challenged and change rapidly. © The Author 2017. Published by Oxford University Press.

  5. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods that have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation.

  6. Sequential and Simultaneous Immunization of Rabbits with HIV-1 Envelope Glycoprotein SOSIP.664 Trimers from Clades A, B and C

    NARCIS (Netherlands)

    Klasse, P. J.; LaBranche, Celia C.; Ketas, Thomas J.; Ozorowski, Gabriel; Cupo, Albert; Pugach, Pavel; Ringe, Rajesh P.; Golabek, Michael; van Gils, Marit J.; Guttman, Miklos; Lee, Kelly K.; Wilson, Ian A.; Butera, Salvatore T.; Ward, Andrew B.; Montefiori, David C.; Sanders, Rogier W.; Moore, John P.

    2016-01-01

    We have investigated the immunogenicity in rabbits of native-like, soluble, recombinant SOSIP.664 trimers based on the env genes of four isolates of human immunodeficiency virus type 1 (HIV-1); specifically BG505 (clade A), B41 (clade B), CZA97 (clade C) and DU422 (clade C). The various trimers were

  7. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  8. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. Cytogenetics of Legumes in the Phaseoloid Clade

    Directory of Open Access Journals (Sweden)

    Aiko Iwata

    2013-11-01

    Cytogenetics played an essential role in studies of chromosome structure, behavior, and evolution in numerous plant species. The advent of molecular cytogenetics combined with recent development of genomic resources has ushered in a new era of chromosome studies that have greatly advanced our knowledge of karyotypic diversity, genome and chromosome organization, and chromosomal evolution in legumes. This review summarizes some of the achievements of cytogenetic studies in legumes in the Phaseoloid clade, which includes several important legume crops such as common bean (Phaseolus vulgaris L.), cowpea [Vigna unguiculata (L.) Walp.], soybean [Glycine max (L.) Merr.], and pigeonpea [Cajanus cajan (L.) Huth]. In the Phaseoloid clade, karyotypes are mostly stable. There are, however, several species with extensive chromosomal changes. Fluorescence in situ hybridization has been useful to reveal chromosomal structure by physically mapping transposons, satellite repeats, ribosomal DNA genes, and bacterial artificial chromosome clones onto chromosomes. Polytene chromosomes, which are much longer than the mitotic chromosomes, have been successfully found and used in cytogenetic studies in some Phaseolus and Vigna species. Molecular cytogenetics will continue to be an important tool in legume genetics and genomics, and we discuss future applications of molecular cytogenetics to better understand chromosome and genome structure and evolution in legumes.

  10. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.

  11. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. Influenza A H5N1 clade 2.3.4 virus with a different antiviral susceptibility profile replaced clade 1 virus in humans in northern Vietnam.

    Directory of Open Access Journals (Sweden)

    Mai T Q Le

    2008-10-01

    Full Text Available Prior to 2007, highly pathogenic avian influenza (HPAI) H5N1 viruses isolated from poultry and humans in Vietnam were consistently reported to be clade 1 viruses, susceptible to oseltamivir but resistant to amantadine. Here we describe the re-emergence of human HPAI H5N1 virus infections in Vietnam in 2007 and the characteristics of the isolated viruses. Respiratory specimens from patients suspected to be infected with avian influenza in 2007 were screened by influenza and H5 subtype specific polymerase chain reaction. Isolated H5N1 strains were further characterized by genome sequencing and drug susceptibility testing. Eleven poultry outbreak isolates from 2007 were included in the sequence analysis. Eight patients, all of them from northern Vietnam, were diagnosed with H5N1 in 2007 and five of them died. Phylogenetic analysis of H5N1 viruses isolated from humans and poultry in 2007 showed that clade 2.3.4 H5N1 viruses replaced clade 1 viruses in northern Vietnam. Four human H5N1 strains had eight-fold reduced in-vitro susceptibility to oseltamivir as compared to clade 1 viruses. In two poultry isolates the I117V mutation was found in the neuraminidase gene, which is associated with reduced susceptibility to oseltamivir. No mutations in the M2 gene conferring amantadine resistance were found. In 2007, H5N1 clade 2.3.4 viruses replaced clade 1 viruses in northern Vietnam and were susceptible to amantadine but showed reduced susceptibility to oseltamivir. Combination antiviral therapy with oseltamivir and amantadine for human cases in Vietnam is recommended.

  13. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  14. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    Science.gov (United States)

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  15. Evidence of Sympatry of Clade A and Clade B Head Lice in a Pre-Columbian Chilean Mummy from Camarones

    Science.gov (United States)

    Boutellis, Amina; Drali, Rezak; Rivera, Mario A.; Mumcuoglu, Kosta Y.; Raoult, Didier

    2013-01-01

    Three different lineages of head lice are known to parasitize humans. Clade A, which is currently worldwide in distribution, was previously demonstrated to be present in the Americas before the time of Columbus. The two other types of head lice are geographically restricted to America and Australia for clade B and to Africa and Asia for clade C. In this study, we tested two operculated nits from a 4,000-year-old Chilean mummy of Camarones for the presence of the partial Cytb mitochondrial gene (270 bp). Our finding shows that clade B head lice were present in America before the arrival of the European colonists. PMID:24204678

  16. Bioinformatics programs are 31-fold over-represented among the highest impact scientific papers of the past two decades.

    Science.gov (United States)

    Wren, Jonathan D

    2016-09-01

    To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P papers, bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success. jdwren@gmail.com Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
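
    The two statistics this abstract relies on, fold enrichment and the Gini coefficient, can be illustrated with a minimal Python sketch. The function names and the citation counts are hypothetical and chosen only for illustration. The abstract does not state the overall share of bioinformatics papers, so the baseline below is simply back-computed from the reported 34% and 31-fold figures and should not be read as a number from the paper.

```python
def fold_enrichment(share_in_top, share_overall):
    """Ratio of a group's share among top-cited papers to its share overall."""
    return share_in_top / share_overall


def gini(values):
    """Gini coefficient of non-negative values: 0 = perfectly even, near 1 = concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over ascending values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Baseline share implied by the abstract's own figures (an assumption, not source data).
implied_overall_share = 0.34 / 31
print(f"implied overall share: {implied_overall_share:.4f}")
print(f"fold enrichment: {fold_enrichment(0.34, implied_overall_share):.1f}")

# Made-up citation counts: one runaway success versus an even spread.
hit_or_miss = [0, 0, 0, 10]
even_spread = [1, 1, 1, 1]
print(f"Gini, hit-or-miss: {gini(hit_or_miss):.2f}")
print(f"Gini, even spread: {gini(even_spread):.2f}")
```

    A higher Gini coefficient for a journal's citation counts means that a few papers collect most of the citations, which is the sense in which the abstract calls bioinformatics resources 'hit or miss'.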

  17. BioXSD: the common data-exchange format for everyday bioinformatics web services.

    Science.gov (United States)

    Kalas, Matús; Puntervoll, Pål; Joseph, Alexandre; Bartaseviciūte, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-09-15

    The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.

  18. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools into their research. We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  19. [BIOINFORMATIC SEARCH AND PHYLOGENETIC ANALYSIS OF THE CELLULOSE SYNTHASE GENES OF FLAX (LINUM USITATISSIMUM)].

    Science.gov (United States)

    Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B

    2015-01-01

    A bioinformatic search for sequences encoding cellulose synthase genes in the flax genome was carried out, and the sequences were compared with their dicot orthologs. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, while the remaining 16 are likely to encode cellulose synthase-like proteins (Csl). Phylogenetic analysis of the gene products of cellulose synthase genes distinguished 6 groups of cellulose synthase genes of different classes: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9, which are associated with primary cell wall formation, show greater similarity within these classes than orthologous sequences, whereas the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of the 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in the flax genome, which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes have been identified in the flax genome, but their correct classification requires additional research.

  20. Identifikasi Secara Serologi Galur Virus Flu Burung Subtipe H5N1 Clade 2.1.3 dan Clade 2.3.2 pada Ayam Petelur (SEROLOGICAL IDENTIFICATION OF AVIAN INFLUENZA STRAIN VIRUS SUBTYPE H5N1 CLADE 2.1.3 AND CLADE 2.3.2 FROM LAYER

    Directory of Open Access Journals (Sweden)

    Aprilia Kusumastuti

    2015-10-01

    Full Text Available The aim of the study was to assess avian influenza (AI) infection in the field by serological testing in three marketing areas of AI vaccines. The haemagglutination inhibition (HI) method was used. Four antigen strains of AI subtype H5N1 clade 2.1.3 (AI strain A/Chicken/West Java/PWT-WIJ/2006, AI strain A/Chicken/Garut/BBVW-223/2007, AI strain A/Chicken/West Java-Nagrak/30/2007, and AI strain A/Chicken/Pekalongan/BBVW-208/2007) and two antigen strains of AI subtype H5N1 clade 2.3.2 (AI strain A/duck/Sukoharjo/BBVW-1428-9/2012 and AI strain A/duck/Sleman/BBVW-1463-10/2012) were used in the HI test. The results show that 93.33% of chicken farms in the three marketing areas of PT. Sanbio Laboratories had a positive antibody titre to AI subtype H5N1 clade 2.1.3. This titre may be the result of AI clade 2.1.3 vaccination. Of 15 samples, 92.86% were positive to AI subtype H5N1 clade 2.3.2 A/duck/Sukoharjo/BBVW-1428-9/2012 and 92.31% were positive to A/duck/Sleman/BBVW-1463-10/2012, even without AI clade 2.3.2 vaccination. This antibody titre may be the result of cross-protection from the AI clade 2.1.3 vaccine or of field infection.

  1. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  2. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  3. BioXSD: the common data-exchange format for everyday bioinformatics web services

    Science.gov (United States)

    Kalaš, Matúš; Puntervoll, Pål; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org PMID:20823319

  4. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  5. Bioclipse: an open source workbench for chemo- and bioinformatics

    Directory of Open Access Journals (Sweden)

    Wagener Johannes

    2007-02-01

    Full Text Available Abstract Background: There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no successful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results: Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion: Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under the Eclipse Public License (EPL), an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at http://www.bioclipse.net.

  6. MAPI: towards the integrated exploitation of bioinformatics Web Services.

    Science.gov (United States)

    Ramirez, Sergio; Karlsson, Johan; Trelles, Oswaldo

    2011-10-27

    Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).
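
    The modular idea described in this abstract (a uniform service descriptor, with protocol-specific invocation supplied by separately installed modules) can be sketched in a few lines of Python. This is not MAPI's actual interface: `ServiceDescriptor`, `ClientCore`, `register`, `invoke` and the stand-in invoker are all hypothetical names, and the endpoint URL is a placeholder. No network call is made.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ServiceDescriptor:
    """Uniform metadata for a service, independent of its protocol (hypothetical)."""
    name: str
    protocol: str
    endpoint: str


# A module is modelled here as a callable that knows how to invoke one protocol.
Invoker = Callable[[ServiceDescriptor, dict], dict]


class ClientCore:
    """Core that only depends on whichever protocol modules have been registered."""

    def __init__(self) -> None:
        self._invokers: Dict[str, Invoker] = {}

    def register(self, protocol: str, invoker: Invoker) -> None:
        # Adding a protocol does not require changing client code.
        self._invokers[protocol] = invoker

    def invoke(self, service: ServiceDescriptor, params: dict) -> dict:
        invoker = self._invokers.get(service.protocol)
        if invoker is None:
            raise LookupError(f"no module installed for protocol {service.protocol!r}")
        return invoker(service, params)


def fake_soap_invoker(service: ServiceDescriptor, params: dict) -> dict:
    # Stand-in for a real SOAP module; it only echoes its input.
    return {"service": service.name, "via": "SOAP", "echo": params}


core = ClientCore()
core.register("SOAP", fake_soap_invoker)
blast = ServiceDescriptor("blast", "SOAP", "http://example.org/blast")
result = core.invoke(blast, {"sequence": "ACGT"})
print(result)
```

    In this sketch a client that never talks to, say, BioMOBY services simply never registers a BioMOBY invoker, which mirrors the abstract's point that only the modules a client needs have to be installed.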

  7. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  8. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  9. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  10. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.

  11. A Review of Recent Advances in Translational Bioinformatics: Bridges from Biology to Medicine.

    Science.gov (United States)

    Vamathevan, J; Birney, E

    2017-08-01

    Objectives: To highlight and provide insights into key developments in translational bioinformatics between 2014 and 2016. Methods: This review describes some of the most influential bioinformatics papers and resources that have been published between 2014 and 2016 as well as the national genome sequencing initiatives that utilize these resources to routinely embed genomic medicine into healthcare. Also discussed are some applications of the secondary use of patient data followed by a comprehensive view of the open challenges and emergent technologies. Results: Although data generation can be performed routinely, analyses and data integration methods still require active research and standardization to improve streamlining of clinical interpretation. The secondary use of patient data has resulted in the development of novel algorithms and has enabled a refined understanding of cellular and phenotypic mechanisms. New data storage and data sharing approaches are required to enable diverse biomedical communities to contribute to genomic discovery. Conclusion: The translation of genomics data into actionable knowledge for use in healthcare is transforming the clinical landscape in an unprecedented way. Exciting and innovative models that bridge the gap between clinical and academic research are set to open up the field of translational bioinformatics for rapid growth in a digital era. Georg Thieme Verlag KG Stuttgart.

  12. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  13. increased specialisation causes the demise of animal clades

    OpenAIRE

    Raia, P.; Carotenuto, F.; Mondanaro, A.; Castiglione, S.; Passaro, F.; Saggese, F.; Melchionna, M.; Serio, C.; Alessio, L.; Silvestro, D.; Fortelius, M.

    2016-01-01

    Animal clades tend to follow a predictable path of waxing and waning during their existence, regardless of their total species richness or geographic coverage. Clades begin small and undifferentiated, then expand to a peak in diversity and range, only to shift into a rarely broken decline towards extinction. While this trajectory is now well documented and broadly recognised, the reasons underlying it remain obscure. In particular, it is unknown why clade extinction is universal and occurs wi...

  14. Evaluation of immunological cross-reactivity between clade A9 high-risk human papillomavirus types on the basis of E6-Specific CD4+ memory T cell responses

    NARCIS (Netherlands)

    van den Hende, Muriel; Redeker, Anke; Kwappenberg, Kitty M. C.; Franken, Kees L. M. C.; Drijfhout, Jan W.; Oostendorp, Jaap; Valentijn, A. Rob P. M.; Fathers, Loraine M.; Welters, Marij J. P.; Melief, Cornelis J. M.; Kenter, Gemma G.; van der Burg, Sjoerd H.; Offringa, Rienk

    2010-01-01

    CD4(+) T cell responses against the E6 oncoprotein of human papillomavirus (HPV) type 16 and 5 closely related members of clade A9 (HPV31, 33, 35, 52, and 58) were charted in peripheral blood mononuclear cell cultures from healthy subjects and patients who underwent HPV16 E6/E7-specific vaccination.

  15. Structural and antigenic variation among diverse clade 2 H5N1 viruses.

    Directory of Open Access Journals (Sweden)

    David A Shore

    Full Text Available Antigenic variation among circulating H5N1 highly pathogenic avian influenza A viruses mandates the continuous production of strain-specific pre-pandemic vaccine candidates and represents a significant challenge for pandemic preparedness. Here we assessed the structural, antigenic and receptor-binding properties of three H5N1 HPAI virus hemagglutinins, which were recently selected by the WHO as vaccine candidates [A/Egypt/N03072/2010 (Egypt10, clade 2.2.1), A/Hubei/1/2010 (Hubei10, clade 2.3.2.1) and A/Anhui/1/2005 (Anhui05, clade 2.3.4)]. These analyses revealed that antigenic diversity among these three isolates was restricted to changes in the size and charge of amino acid side chains at a handful of positions, spatially equivalent to the antigenic sites identified in H1 subtype viruses circulating among humans. All three of the H5N1 viruses analyzed in this study were responsible for fatal human infections, with the most recently-isolated strains, Hubei10 and Egypt10, containing multiple residues in the receptor-binding site of the HA, which were suspected to enhance mammalian transmission. However, glycan-binding analyses demonstrated a lack of binding to human α2-6-linked sialic acid receptor analogs for all three HAs, reinforcing the notion that receptor-binding specificity contributes only partially to transmissibility and pathogenesis of HPAI viruses and suggesting that changes in host specificity must be interpreted in the context of the host and environmental factors, as well as the virus as a whole. Together, our data reveal structural linkages with phylogenetic and antigenic analyses of recently emerged H5N1 virus clades and should assist in interpreting the significance of future changes in antigenic and receptor-binding properties.

  16. HIV controllers exhibit enhanced frequencies of major histocompatibility complex class II tetramer+ Gag-specific CD4+ T cells in chronic clade C HIV-1 infection

    DEFF Research Database (Denmark)

    Laher, Faatima; Ranasinghe, Srinika; Porichis, Filippos

    2017-01-01

    Immune control of viral infections is heavily dependent on helper CD4+ T cell function. However, the understanding of the contribution of HIV-specific CD4+ T cell responses to immune protection against HIV-1, particularly in clade C infection, remains incomplete. Recently, major histocompatibilit...

  17. BioQueue: a novel pipeline framework to accelerate bioinformatics analysis.

    Science.gov (United States)

    Yao, Li; Wang, Heming; Song, Yuanyuan; Sui, Guangchao

    2017-10-15

    With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users' experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. BioQueue is freely available at https://github.com/liyao001/BioQueue. The extensive documentation can be found at http://bioqueue.readthedocs.io. Contact: li_yao@outlook.com or gcsui@nefu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  18. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suited to those accuracy measures. However, there is often a discrepancy between the estimator employed and the accuracy measure given for the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework for designing MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
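    As an illustration of the maximum expected accuracy idea on a binary space, the following is a minimal sketch and not code from the paper. It assumes the gain is gamma * (true positives) + (true negatives), that the marginal probabilities are already known, and that the prediction space is unconstrained; under those assumptions the expected gain is maximized by predicting 1 wherever the marginal exceeds 1/(gamma + 1). The function names and the weight `gamma` are illustrative choices.

```python
from typing import List, Sequence


def gamma_centroid(marginals: Sequence[float], gamma: float) -> List[int]:
    """Predict 1 at each position whose marginal probability exceeds 1/(gamma+1).

    Assumes an unconstrained binary prediction space; structured problems
    (e.g. base-pairing constraints) would need dynamic programming instead.
    """
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    threshold = 1.0 / (gamma + 1.0)
    return [1 if p > threshold else 0 for p in marginals]


def expected_gain(marginals: Sequence[float],
                  prediction: Sequence[int],
                  gamma: float) -> float:
    """Expected value of gamma * TP + TN under independent marginals."""
    return sum(gamma * p if y == 1 else (1.0 - p)
               for p, y in zip(marginals, prediction))


if __name__ == "__main__":
    probs = [0.9, 0.4, 0.2]
    for g in (1.0, 2.0):
        pred = gamma_centroid(probs, g)
        print(g, pred, expected_gain(probs, pred, g))
```

    Raising `gamma` lowers the threshold, trading positive predictive value for sensitivity.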

  19. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, requiring researchers to access multiple applications and perform manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed not only to query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links for support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Contact: Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  20. The first complete chloroplast genome of the Genistoid legume Lupinus luteus: evidence for a novel major lineage-specific rearrangement and new insights regarding plastome evolution in the legume family

    OpenAIRE

    Martin, Guillaume E.; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; Ferreira De Carvalho, Julie; Aïnouche, Malika L.; Salmon, Armel; Aïnouche, Abdelkader

    2014-01-01

    Support from the 'Plate-forme Génomique Environnementale et Fonctionnelle' (OSUR: INEE-CNRS) and the Genouest Bioinformatic Plateform (University of Rennes 1); International audience. Background and Aims: To date, chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the 'inverted repeat-lacking clade', IRLC). It is thus very important to sequence plastomes from oth...

  1. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
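    The graph view of biological networks mentioned above can be sketched in a few lines. This is an illustrative example only: the node labels and edges are made up, and splitting the network into connected components is just one simple graph-theoretic analysis among many.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from pairwise interactions."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def connected_components(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return the connected components, found by breadth-first search."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        component: Set[str] = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


if __name__ == "__main__":
    # Hypothetical interactions between placeholder proteins A to E.
    network = build_graph([("A", "B"), ("B", "C"), ("D", "E")])
    print({node: len(nbrs) for node, nbrs in network.items()})  # node degrees
    print(connected_components(network))
```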

  2. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  3. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzhen, China, September 20-22, 2017.

  4. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange between bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  5. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  6. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  7. Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers.

    Science.gov (United States)

    Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.

  8. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Background: Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial, as incorrectly computed results may lead to wrong biological conclusions or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results: We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conforms to a set of domain-specific properties, called metamorphic relations (MRs), thus greatly increasing the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion: This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work
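    To make the idea of a metamorphic relation concrete, here is a minimal sketch. It does not reproduce the paper's test cases: `map_reads` is a hypothetical toy exact-match mapper standing in for a real tool, and the two relations checked (read order must not change the result; adding a read must not remove existing hits) are assumed examples of the kind of property MT relies on. Neither check needs to know the correct output for any single input.

```python
import random
from typing import List, Set, Tuple


def map_reads(reads: List[str], reference: str) -> Set[Tuple[str, int]]:
    """Toy exact-match mapper (hypothetical): return (read, position) hits."""
    hits: Set[Tuple[str, int]] = set()
    for read in reads:
        start = reference.find(read)
        while start != -1:
            hits.add((read, start))
            start = reference.find(read, start + 1)
    return hits


def check_permutation_relation(reads: List[str], reference: str,
                               seed: int = 0) -> bool:
    """MR 1: shuffling the input reads must leave the set of hits unchanged."""
    shuffled = reads[:]
    random.Random(seed).shuffle(shuffled)
    return map_reads(reads, reference) == map_reads(shuffled, reference)


def check_extension_relation(reads: List[str], reference: str,
                             extra: str) -> bool:
    """MR 2: adding one more read must not remove any existing hit."""
    return map_reads(reads, reference) <= map_reads(reads + [extra], reference)


if __name__ == "__main__":
    ref = "ACGTACGT"
    sample = ["ACG", "GTA", "TTT"]
    print(check_permutation_relation(sample, ref))
    print(check_extension_relation(sample, ref, "CGT"))
```

    A program that violates either relation is faulty even though no expected output was ever specified.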

  9. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  10. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  11. Towards a molecular taxonomic key of the Aurantioideae subfamily using chloroplastic SNP diagnostic markers of the main clades genotyped by competitive allele-specific PCR.

    Science.gov (United States)

    Oueslati, Amel; Ollitrault, Frederique; Baraket, Ghada; Salhi-Hannachi, Amel; Navarro, Luis; Ollitrault, Patrick

    2016-08-18

    Chloroplast DNA is a primary source of molecular variations for phylogenetic analysis of photosynthetic eukaryotes. However, the sequencing and analysis of multiple chloroplastic regions is difficult to apply to large collections or large samples of natural populations. The objective of our work was to demonstrate that a molecular taxonomic key based on an easy, scalable and low-cost genotyping method can be developed from a set of Single Nucleotide Polymorphisms (SNPs) diagnostic of well-established clades. It was applied to the Aurantioideae subfamily, the largest group of the Rutaceae family that includes the cultivated citrus species. The publicly available nucleotide sequences of eight plastid genomic regions were compared for 79 accessions of the Aurantioideae subfamily to search for SNPs revealing taxonomic differentiation at the inter-tribe, inter-subtribe, inter-genus and interspecific levels. Diagnostic SNPs (DSNPs) were found for 46 of the 54 clade levels analysed. Forty DSNPs were selected to develop KASPar markers and their taxonomic value was tested by genotyping 108 accessions of the Aurantioideae subfamily. Twenty-seven markers diagnostic of 24 clades were validated and they displayed a very high rate of transferability in the Aurantioideae subfamily (only 1.2 % of missing data on average). The UPGMA from the validated markers produced a cladistic organisation that was highly coherent with the previous phylogenetic analysis based on the sequence data of the eight plastid regions. In particular, the monophyletic origin of the "true citrus" genera plus Oxanthera was validated. However, some clarification remains necessary regarding the organisation of the other wild species of the Citreae tribe. We validated the concept that with well-established clades, DSNPs can be selected and efficiently transformed into competitive allele-specific PCR markers (KASPar method) allowing cost-effective highly efficient cladistic analysis in large collections at

  12. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  13. pocketZebra: a web-server for automated selection and classification of subfamily-specific binding sites by bioinformatic analysis of diverse protein families.

    Science.gov (United States)

    Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Svedas, Vytas

    2014-07-01

    The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure-function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. jORCA: easily integrating bioinformatics Web Services.

    Science.gov (United States)

    Martín-Requena, Victoria; Ríos, Javier; García, Maximiliano; Ramírez, Sergio; Trelles, Oswaldo

    2010-02-15

    Web services technology is becoming the option of choice to deploy bioinformatics tools that are universally available. One of the major strengths of this approach is that it supports machine-to-machine interoperability over a network. However, a weakness of this approach is that various Web Services differ in their definition and invocation protocols, as well as their communication and data formats, and this presents a barrier to service interoperability. jORCA is a desktop client aimed at facilitating seamless integration of Web Services. It does so by making a uniform representation of the different web resources, supporting scalable service discovery, and automatic composition of workflows. Usability is at the top of the jORCA agenda; thus it is a highly customizable and extensible application that accommodates a broad range of user skills featuring double-click invocation of services in conjunction with advanced execution-control, on the fly data standardization, extensibility of viewer plug-ins, drag-and-drop editing capabilities, plus a file-based browsing style and organization of favourite tools. The integration of bioinformatics Web Services is made easier to support a wider range of users.

  15. Lineage-specific responses of tooth shape in murine rodents (Murinae, Rodentia) to late Miocene dietary change in the Siwaliks of Pakistan.

    Science.gov (United States)

    Kimura, Yuri; Jacobs, Louis L; Flynn, Lawrence J

    2013-01-01

    Past ecological responses of mammals to climate change are recognized in the fossil record by adaptive significance of morphological variations. To understand the role of dietary behavior on functional adaptations of dental morphology in rodent evolution, we examine evolutionary change of tooth shape in late Miocene Siwalik murine rodents, which experienced a dietary shift toward C4 diets during late Miocene ecological change indicated by carbon isotopic evidence. Geometric morphometric analysis in the outline of upper first molars captures dichotomous lineages of Siwalik murines, in agreement with phylogenetic hypotheses of previous studies (two distinct clades: the Karnimata and Progonomys clades), and indicates lineage-specific functional responses to mechanical properties of their diets. Tooth shapes of the two clades are similar at their sympatric origin but deviate from each other with decreasing overlap through time. Shape change in the Karnimata clade is associated with greater efficiency of propalinal chewing for tough diets than in the Progonomys clade. Larger body mass in Karnimata may be related to exploitation of lower-quality food items, such as grasses, than in smaller-bodied Progonomys. The functional and ecophysiological aspects of Karnimata exploiting C4 grasses are concordant with their isotopic dietary preference relative to Progonomys. Lineage-specific selection was differentially greater in Karnimata, and a faster rate of shape change toward derived Karnimata facilitated inclusion of C4 grasses in the diet. Sympatric speciation in these clades is most plausibly explained by interspecific competition on resource utilization between the two, based on comparisons of our results with the carbon isotope data. Interspecific competition with Karnimata may have suppressed morphological innovation of the Progonomys clade. Pairwise analyses of morphological and carbon isotope data can uncover ecological causes of sympatric speciation and define

  16. Lineage-specific responses of tooth shape in murine rodents (Murinae, Rodentia) to late Miocene dietary change in the Siwaliks of Pakistan.

    Directory of Open Access Journals (Sweden)

    Yuri Kimura

    Past ecological responses of mammals to climate change are recognized in the fossil record by adaptive significance of morphological variations. To understand the role of dietary behavior on functional adaptations of dental morphology in rodent evolution, we examine evolutionary change of tooth shape in late Miocene Siwalik murine rodents, which experienced a dietary shift toward C4 diets during late Miocene ecological change indicated by carbon isotopic evidence. Geometric morphometric analysis in the outline of upper first molars captures dichotomous lineages of Siwalik murines, in agreement with phylogenetic hypotheses of previous studies (two distinct clades: the Karnimata and Progonomys clades), and indicates lineage-specific functional responses to mechanical properties of their diets. Tooth shapes of the two clades are similar at their sympatric origin but deviate from each other with decreasing overlap through time. Shape change in the Karnimata clade is associated with greater efficiency of propalinal chewing for tough diets than in the Progonomys clade. Larger body mass in Karnimata may be related to exploitation of lower-quality food items, such as grasses, than in smaller-bodied Progonomys. The functional and ecophysiological aspects of Karnimata exploiting C4 grasses are concordant with their isotopic dietary preference relative to Progonomys. Lineage-specific selection was differentially greater in Karnimata, and a faster rate of shape change toward derived Karnimata facilitated inclusion of C4 grasses in the diet. Sympatric speciation in these clades is most plausibly explained by interspecific competition on resource utilization between the two, based on comparisons of our results with the carbon isotope data. Interspecific competition with Karnimata may have suppressed morphological innovation of the Progonomys clade. Pairwise analyses of morphological and carbon isotope data can uncover ecological causes of sympatric speciation

  17. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  18. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  19. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities, the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  20. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  1. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  2. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  3. Bioinformatic tools and guideline for PCR primer design | Abd ...

    African Journals Online (AJOL)

    Bioinformatics has become an essential tool not only for basic research but also for applied research in biotechnology and biomedical sciences. Optimal primer sequence and appropriate primer concentration are essential for maximal specificity and efficiency of PCR. A poorly designed primer can result in little or no ...
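
    As a rough illustration of the kind of sequence property a primer-design tool inspects, the sketch below computes the GC fraction of a primer and a rule-of-thumb melting temperature. It is not taken from the article: the function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a coarse estimate intended for short oligonucleotides.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees Celsius.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C).
    Assumption: a short primer made only of A, C, G and T.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    candidate = "ATGCATGCATGC"  # made-up sequence, for illustration only
    print(gc_fraction(candidate), wallace_tm(candidate))
```

    Dedicated tools use more detailed thermodynamic models; a check like this is only useful as a first filter.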

  4. Characterization of Prochlorococcus clades from iron-depleted oceanic regions.

    Science.gov (United States)

    Rusch, Douglas B; Martiny, Adam C; Dupont, Christopher L; Halpern, Aaron L; Venter, J Craig

    2010-09-14

    Prochlorococcus describes a diverse and abundant genus of marine photosynthetic microbes. It is primarily found in oligotrophic waters across the globe and plays a crucial role in energy and nutrient cycling in the ocean ecosystem. The abundance, global distribution, and availability of isolates make Prochlorococcus a model system for understanding marine microbial diversity and biogeochemical cycling. Analysis of 73 metagenomic samples from the Global Ocean Sampling expedition acquired in the Atlantic, Pacific, and Indian Oceans revealed the presence of two uncharacterized Prochlorococcus clades. A phylogenetic analysis using six different genetic markers places the clades close to known lineages adapted to high-light environments. The two uncharacterized clades consistently cooccur and dominate the surface waters of high-temperature, macronutrient-replete, and low-iron regions of the Eastern Equatorial Pacific upwelling and the tropical Indian Ocean. They are genetically distinct from each other and other high-light Prochlorococcus isolates and likely define a previously unrecognized ecotype. Our detailed genomic analysis indicates that these clades comprise organisms that are adapted to iron-depleted environments by reducing their iron quota through the loss of several iron-containing proteins that likely function as electron sinks in the photosynthetic pathway in other Prochlorococcus clades from high-light environments. The presence and inferred physiology of these clades may explain why Prochlorococcus populations from iron-depleted regions do not respond to iron fertilization experiments and further expand our understanding of how phytoplankton adapt to variations in nutrient availability in the ocean.

  5. XML schemas for common bioinformatic data types and their application in workflow systems.

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-11-06

    Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

  6. XML schemas for common bioinformatic data types and their application in workflow systems

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-01-01

    Background: Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results: Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. Conclusion: The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823

  7. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  8. Expanding the World of Marine Bacterial and Archaeal Clades

    Science.gov (United States)

    Yilmaz, Pelin; Yarza, Pablo; Rapp, Josephine Z.; Glöckner, Frank O.

    2016-01-01

    Determining which microbial taxa are out there, where they live, and what they are doing is a driving approach in marine microbial ecology. The importance of these questions is underlined by concerted, large-scale, and global ocean sampling initiatives, for example the International Census of Marine Microbes, Ocean Sampling Day, or Tara Oceans. Given decades of effort, we know that the large majority of marine Bacteria and Archaea belong to about a dozen phyla. In addition to the classically culturable Bacteria and Archaea, at least 50 “clades,” at different taxonomic depths, exist. These account for the majority of marine microbial diversity, but there is still an underexplored and less abundant portion remaining. We refer to these hitherto unrecognized clades as unknown, as their boundaries, names, and classifications are not available. In this work, we were able to characterize up to 92 of these unknown clades found within the bacterial and archaeal phylogenetic diversity currently reported for marine water column environments. We mined the SILVA 16S rRNA gene datasets for sequences originating from the marine water column. Instead of the usual subjective taxa delineation and nomenclature methods, we applied the candidate taxonomic unit (CTU) circumscription system, along with a standardized nomenclature to the sequences in newly constructed phylogenetic trees. With this new phylogenetic and taxonomic framework, we performed an analysis of ICoMM rRNA gene amplicon datasets to gain insights into the global distribution of the new marine clades, their ecology, biogeography, and interaction with oceanographic variables. Most of the new clades we identified were interspersed by known taxa with cultivated members, whose genome sequences are available. This result encouraged us to perform metabolic predictions for the novel marine clades using the PICRUSt approach. Our work also provides an update on the taxonomy of several phyla and widely known marine clades as

  9. Targeted sequencing of clade-specific markers from skin microbiomes for forensic human identification.

    Science.gov (United States)

    Schmedes, Sarah E; Woerner, August E; Novroski, Nicole M M; Wendt, Frank R; King, Jonathan L; Stephens, Kathryn M; Budowle, Bruce

    2018-01-01

    The human skin microbiome is comprised of diverse communities of bacterial, eukaryotic, and viral taxa and contributes millions of additional genes to the repertoire of human genes, affecting human metabolism and immune response. Numerous genetic and environmental factors influence the microbiome composition and as such contribute to individual-specific microbial signatures which may be exploited for forensic applications. Previous studies have demonstrated the potential to associate skin microbial profiles collected from touched items to their individual owner, mainly using unsupervised methods from samples collected over short time intervals. Those studies utilize either targeted 16S rRNA or shotgun metagenomic sequencing to characterize skin microbiomes; however, these approaches have limited species and strain resolution and susceptibility to stochastic effects, respectively. Clade-specific markers from the skin microbiome, using supervised learning, can predict individual identity using skin microbiomes from their respective donors with high accuracy. In this study the hidSkinPlex is presented, a novel targeted sequencing method using skin microbiome markers developed for human identification. The hidSkinPlex (comprised of 286 bacterial (and phage) family-, genus-, species-, and subspecies-level markers), initially was evaluated on three bacterial control samples represented in the panel (i.e., Propionibacterium acnes, Propionibacterium granulosum, and Rothia dentocariosa) to assess the performance of the multiplex. The hidSkinPlex was further evaluated for prediction purposes. The hidSkinPlex markers were used to attribute skin microbiomes collected from eight individuals from three body sites (i.e., foot (Fb), hand (Hp) and manubrium (Mb)) to their host donor. Supervised learning, specifically regularized multinomial logistic regression and 1-nearest-neighbor classification were used to classify skin microbiomes to their hosts with up to 92% (Fb), 96% (Mb

  10. A semantic web approach applied to integrative bioinformatics experimentation: a biological use case with genomics data.

    NARCIS (Netherlands)

    Post, L.J.G.; Roos, M.; Marshall, M.S.; van Driel, R.; Breit, T.M.

    2007-01-01

    The numerous public data resources make integrative bioinformatics experimentation increasingly important in life sciences research. However, it is severely hampered by the way the data and information are made available. The semantic web approach enhances data exchange and integration by providing

  11. Meiotic Clade AAA ATPases: Protein Polymer Disassembly Machines.

    Science.gov (United States)

    Monroe, Nicole; Hill, Christopher P

    2016-05-08

    Meiotic clade AAA ATPases (ATPases associated with diverse cellular activities), which were initially grouped on the basis of phylogenetic classification of their AAA ATPase cassette, include four relatively well characterized family members, Vps4, spastin, katanin and fidgetin. These enzymes all function to disassemble specific polymeric protein structures, with Vps4 disassembling the ESCRT-III polymers that are central to the many membrane-remodeling activities of the ESCRT (endosomal sorting complexes required for transport) pathway and spastin, katanin p60 and fidgetin affecting multiple aspects of cellular dynamics by severing microtubules. They share a common domain architecture that features an N-terminal MIT (microtubule interacting and trafficking) domain followed by a single AAA ATPase cassette. Meiotic clade AAA ATPases function as hexamers that can cycle between the active assembly and inactive monomers/dimers in a regulated process, and they appear to disassemble their polymeric substrates by translocating subunits through the central pore of their hexameric ring. Recent studies with Vps4 have shown that nucleotide-induced asymmetry is a requirement for substrate binding to the pore loops and that recruitment to the protein lattice via MIT domains also relieves autoinhibition and primes the AAA ATPase cassettes for substrate binding. The most striking, unifying feature of meiotic clade AAA ATPases may be their MIT domain, which is a module that is found in a wide variety of proteins that localize to ESCRT-III polymers. Spastin also displays an adjacent microtubule binding sequence, and the presence of both ESCRT-III and microtubule binding elements may underlie the recent findings that the ESCRT-III disassembly function of Vps4 and the microtubule-severing function of spastin, as well as potentially katanin and fidgetin, are highly coordinated. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. An overview of topic modeling and its current applications in bioinformatics.

    Science.gov (United States)

    Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei

    2016-01-01

    With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics because of their interpretability. Our aim was to review the application and development of topic models for bioinformatics. This paper starts with the description of a topic model, with a focus on the understanding of topic modeling. A general outline is provided on how to build an application in a topic model and how to develop a topic model. Meanwhile, the literature on application of topic models to biological data was searched and analyzed in depth. According to the types of models and the analogy between the concept of document-topic-word and a biological object (as well as the tasks of a topic model), we categorized the related studies and provided an outlook on the use of topic models for the development of bioinformatics applications. Topic modeling is a useful method (in contrast to the traditional means of data reduction in bioinformatics) and enhances researchers' ability to interpret biological information. Nevertheless, due to the lack of topic models optimized for specific biological data, the studies on topic modeling in biological data still have a long and challenging road ahead. We believe that topic models are a promising method for various applications in bioinformatics research.

  13. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  14. Bioinformatics process management: information flow via a computational journal

    Directory of Open Access Journals (Sweden)

    Lushington Gerald

    2007-12-01

    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread; ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples.

  15. Adaptation of a Bioinformatics Microarray Analysis Workflow for a Toxicogenomic Study in Rainbow Trout.

    Directory of Open Access Journals (Sweden)

    Sophie Depiereux

    Sex steroids play a key role in triggering sex differentiation in fish, with exogenous hormone treatment leading to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development) and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to uncover mechanisms of action (MOA) and identify biomarkers related to the toxic action of a compound. However, high throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish, by the analysis of molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L) of ethynylestradiol (EE2), and to offer target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow elaborated on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow allowed us to obtain lists of genes supposed to be enriched in true positive differentially expressed genes (DEGs), which were subjected to over-representation analysis methods (ORA). Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group displayed specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and

  16. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human wealth. However, bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We then employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with a well-known taxon containing the active compound, using published papers or chemical databases. Then we analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell were examined to determine the mechanism of action of the active compounds at the cellular level, as well as to predict side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anticancer compound from a marine invertebrate by employing the method. The anti-cancer activity was explored based on the isolated compounds of the marine invertebrate from published articles and databases; we then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate, and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactives from Indonesian biodiversity for source of drug and another

  17. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  18. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  19. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
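
    The remark about the huge number of possible trees can be made concrete with a standard counting result: for n labelled OTUs there are (2n-5)!! distinct unrooted binary tree topologies and (2n-3)!! rooted ones. The sketch below is illustrative only and is not code from the paper; the function names are hypothetical.

```python
def odd_double_factorial(k: int) -> int:
    """Product k * (k - 2) * (k - 4) * ... down to 1, for odd k >= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_otus: int) -> int:
    """Number of unrooted binary topologies on n labelled leaves: (2n - 5)!!."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs for an unrooted binary tree")
    return odd_double_factorial(2 * n_otus - 5)


def count_rooted_trees(n_otus: int) -> int:
    """Number of rooted binary topologies on n labelled leaves: (2n - 3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs for a rooted binary tree")
    return odd_double_factorial(2 * n_otus - 3)


if __name__ == "__main__":
    # Show how quickly the search space grows with the number of OTUs.
    for n in (4, 5, 10, 20):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    Even ten OTUs already give more than two million unrooted topologies, which is why exhaustive search over trees is impractical beyond very small samples.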

  20. Rapid cloning and bioinformatic analysis of spinach Y chromosome ...

    Indian Academy of Sciences (India)

    Rapid cloning and bioinformatic analysis of spinach Y chromosome-specific EST sequences. Chuan-Liang Deng, Wei-Li Zhang, Ying Cao, Shao-Jing Wang, ... Arabidopsis thaliana mRNA for mitochondrial half-ABC transporter (STA1 gene). 389 2.31E-13. 98.96. SP3−12. Betula pendula histidine kinase 3 (HK3) mRNA, ...

  1. A Quick Guide for Building a Successful Bioinformatics Community

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  2. Loss of lager specific genes and subtelomeric regions define two different Saccharomyces cerevisiae lineages for Saccharomyces pastorianus Group I and II strains.

    Science.gov (United States)

    Monerawela, Chandre; James, Tharappel C; Wolfe, Kenneth H; Bond, Ursula

    2015-03-01

    Lager yeasts, Saccharomyces pastorianus, are interspecies hybrids between S. cerevisiae and S. eubayanus and are classified into Group I and Group II clades. The genome of the Group II strain, Weihenstephan 34/70, contains eight so-called 'lager-specific' genes that are located in subtelomeric regions. We evaluated the origins of these genes through bioinformatic and PCR analyses of Saccharomyces genomes. We determined that four are of cerevisiae origin while four originate from S. eubayanus. The Group I yeasts contain all four S. eubayanus genes but individual strains contain only a subset of the cerevisiae genes. We identified S. cerevisiae strains that contain all four cerevisiae 'lager-specific' genes, and distinct patterns of loss of these genes in other strains. Analysis of the subtelomeric regions uncovered patterns of loss in different S. cerevisiae strains. We identify two classes of S. cerevisiae strains: ale yeasts (Foster O) and stout yeasts with patterns of 'lager-specific' genes and subtelomeric regions identical to Group I and II S. pastorianus yeasts, respectively. These findings lead us to propose that Group I and II S. pastorianus strains originate from separate hybridization events involving different S. cerevisiae lineages. Using the combined bioinformatic and PCR data, we describe a potential classification map for industrial yeasts. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permission@oup.com.

  3. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
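    The dynamic-programming formulation mentioned in this abstract can be illustrated with a minimal sketch of global alignment scoring (the Needleman-Wunsch recurrence). The function name and the scoring values (match = 1, mismatch = -1, gap = -2) are illustrative assumptions and are not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Scoring values are hypothetical defaults chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Option 1: align a[i-1] with b[j-1] (match or mismatch).
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Option 2: align a[i-1] with a gap.
            up = score[i - 1][j] + gap
            # Option 3: align b[j-1] with a gap.
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATACA"))
```

    Each table cell depends only on its three neighbours, so the table is filled in time proportional to the product of the two sequence lengths; recovering the alignment itself would additionally require a traceback, which this sketch omits.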

  4. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  5. Phylogenetic analysis of New Zealand earthworms (Oligochaeta: Megascolecidae) reveals ancient clades and cryptic taxonomic diversity.

    Science.gov (United States)

    Buckley, Thomas R; James, Sam; Allwood, Julia; Bartlam, Scott; Howitt, Robyn; Prada, Diana

    2011-01-01

    We have constructed the first ever phylogeny for the New Zealand earthworm fauna (Megascolecinae and Acanthodrilinae) including representatives from other major continental regions. Bayesian and maximum likelihood phylogenetic trees were constructed from 427 base pairs from the mitochondrial large subunit (16S) rRNA gene and 661 base pairs from the nuclear large subunit (28S) rRNA gene. Within the Acanthodrilinae we were able to identify a number of well-supported clades that were restricted to continental landmasses. Estimates of nodal support for these major clades were generally high, but relationships among clades were poorly resolved. The phylogenetic analyses revealed several independent lineages in New Zealand, some of which had a comparable phylogenetic depth to monophyletic groups sampled from Madagascar, Africa, North America and Australia. These results are consistent with at least some of these clades having inhabited New Zealand since rifting from Gondwana in the Late Cretaceous. Within the New Zealand Acanthodrilinae, major clades tended to be restricted to specific regions of New Zealand, with the central North Island and Cook Strait representing major biogeographic boundaries. Our field surveys of New Zealand and subsequent identification have also revealed extensive cryptic taxonomic diversity with approximately 48 new species sampled in addition to the 199 species recognized by previous authors. Our results indicate that further survey and taxonomic work is required to establish a foundation for future biogeographic and ecological research on this vitally important component of the New Zealand biota. Copyright © 2010 Elsevier Inc. All rights reserved.

  6. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed a bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and is a gelatinase-producing bacterium with species-specific activity toward porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes specific to each of the two meat species. The main focus of this research is to identify the gelatinase-encoding gene of L. sphaericus through bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) long and encodes 520 amino acids; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp long and encodes 591 amino acids; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp long and encodes 566 amino acids. Three pairs of oligonucleotide primers (F1/R1, F2/R2 and F3/R3) were designed to target short cDNA sequences by PCR. The PCR reliably yielded amplicons of 1563 bp for candidate gene P1 and 1701 bp for candidate gene P3. The bioinformatics analysis of L. sphaericus thus identified candidate genes encoding gelatinase.

  7. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the Swiss Army knife of genetics, with many different uses in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  8. Bioinformatics research in the Asia Pacific: a 2007 update.

    Science.gov (United States)

    Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee

    2008-01-01

    We provide a 2007 update on bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, set up in 1998. Since 2002, APBioNet has organized the International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 in Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides the scientific meeting in Hong Kong, satellite events include a pre-conference training workshop in Hanoi, Vietnam and a post-conference workshop in Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region to global bioinformatics endeavours.

  9. Progress to extinction: increased specialisation causes the demise of animal clades.

    Science.gov (United States)

    Raia, P; Carotenuto, F; Mondanaro, A; Castiglione, S; Passaro, F; Saggese, F; Melchionna, M; Serio, C; Alessio, L; Silvestro, D; Fortelius, M

    2016-08-10

    Animal clades tend to follow a predictable path of waxing and waning during their existence, regardless of their total species richness or geographic coverage. Clades begin small and undifferentiated, then expand to a peak in diversity and range, only to shift into a rarely broken decline towards extinction. While this trajectory is now well documented and broadly recognised, the reasons underlying it remain obscure. In particular, it is unknown why clade extinction is universal and occurs with such surprising regularity. Current explanations for paleontological extinctions call on the growing costs of biological interactions, geological accidents, evolutionary traps, and mass extinctions. While these are effective causes of extinction, they mainly apply to species, not clades. Although mass extinctions are the undeniable cause for the demise of a sizeable number of major taxa, we show here that clades escaping them go extinct because of the widespread tendency of evolution to produce increasingly specialised, sympatric, and geographically restricted species over time.

  10. Progress to extinction: increased specialisation causes the demise of animal clades

    Science.gov (United States)

    Raia, P.; Carotenuto, F.; Mondanaro, A.; Castiglione, S.; Passaro, F.; Saggese, F.; Melchionna, M.; Serio, C.; Alessio, L.; Silvestro, D.; Fortelius, M.

    2016-08-01

    Animal clades tend to follow a predictable path of waxing and waning during their existence, regardless of their total species richness or geographic coverage. Clades begin small and undifferentiated, then expand to a peak in diversity and range, only to shift into a rarely broken decline towards extinction. While this trajectory is now well documented and broadly recognised, the reasons underlying it remain obscure. In particular, it is unknown why clade extinction is universal and occurs with such surprising regularity. Current explanations for paleontological extinctions call on the growing costs of biological interactions, geological accidents, evolutionary traps, and mass extinctions. While these are effective causes of extinction, they mainly apply to species, not clades. Although mass extinctions are the undeniable cause for the demise of a sizeable number of major taxa, we show here that clades escaping them go extinct because of the widespread tendency of evolution to produce increasingly specialised, sympatric, and geographically restricted species over time.

  11. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of unskilled researchers. A portal enabling these researchers to benefit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  12. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    Background: The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of unskilled researchers. A portal enabling these researchers to benefit from new technologies is still missing. Results: We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion: We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical

  13. Phylogenetic Framework and Molecular Signatures for the Main Clades of the Phylum Actinobacteria

    Science.gov (United States)

    Gao, Beile

    2012-01-01

    Summary: The phylum Actinobacteria harbors many important human pathogens and also provides one of the richest sources of natural products, including numerous antibiotics and other compounds of biotechnological interest. Thus, a reliable phylogeny of this large phylum and the means to accurately identify its different constituent groups are of much interest. Detailed phylogenetic and comparative analyses of >150 actinobacterial genomes reported here form the basis for achieving these objectives. In phylogenetic trees based upon 35 conserved proteins, most of the main groups of Actinobacteria as well as a number of their suprageneric clades are resolved. We also describe large numbers of molecular markers consisting of conserved signature indels in protein sequences and whole proteins that are specific for either all Actinobacteria or their different clades (viz., orders, families, genera, and subgenera) at various taxonomic levels. These signatures independently support the existence of different phylogenetic clades, and based upon them, it is now possible to delimit the phylum Actinobacteria (excluding Coriobacteriia) and most of its major groups in clear molecular terms. The species distribution patterns of these markers also provide important information regarding the interrelationships among different main orders of Actinobacteria. The identified molecular markers, in addition to enabling the development of a stable and reliable phylogenetic framework for this phylum, also provide novel and powerful means for the identification of different groups of Actinobacteria in diverse environments. Genetic and biochemical studies on these Actinobacteria-specific markers should lead to the discovery of novel biochemical and/or other properties that are unique to different groups of Actinobacteria. PMID:22390973

  14. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer Science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  15. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    Science.gov (United States)

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  16. Rapid Differentiation between Livestock-Associated and Livestock-Independent Staphylococcus aureus CC398 Clades

    Science.gov (United States)

    Larsen, Jesper; Soldanova, Katerina; Aziz, Maliha; Contente-Cuomo, Tania; Petersen, Andreas; Vandendriessche, Stien; Jiménez, Judy N.; Mammina, Caterina; van Belkum, Alex; Salmenlinna, Saara; Laurent, Frederic; Skov, Robert L.; Larsen, Anders R.; Andersen, Paal S.; Price, Lance B.

    2013-01-01

    Staphylococcus aureus clonal complex 398 (CC398) isolates cluster into two distinct phylogenetic clades based on single-nucleotide polymorphisms (SNPs) revealing a basal human clade and a more derived livestock clade. The scn and tet(M) genes are strongly associated with the human and the livestock clade, respectively, due to loss and acquisition of mobile genetic elements. We present canonical single-nucleotide polymorphism (canSNP) assays that differentiate the two major host-associated S. aureus CC398 clades and a duplex PCR assay for detection of scn and tet(M). The canSNP assays correctly placed 88 S. aureus CC398 isolates from a reference collection into the human and livestock clades and the duplex PCR assay correctly identified scn and tet(M). The assays were successfully applied to a geographically diverse collection of 272 human S. aureus CC398 isolates. The simple assays described here generate signals comparable to a whole-genome phylogeny for major clade assignment and are easily integrated into S. aureus CC398 surveillance programs and epidemiological studies. PMID:24244535

  17. Protection level of AI H5N1 vaccine clade 2.1.3 commercial against AI H5N1 clade 2.3.2 virus from Ducks to SPF chicken in laboratory conditions

    Directory of Open Access Journals (Sweden)

    Indriani R

    2015-03-01

    Highly Pathogenic Avian Influenza (HPAI) subtype H5N1 clade 2.3.2 has infected chickens on farms, causing mortality and a decrease in egg production. Vaccination is one of the strategies to control AI subtype H5N1 disease. An AI H5N1 clade 2.1.3 vaccine is available commercially. The effectiveness of two AI H5N1 clade 2.1.3 vaccines (products A and B) and an AI H5N1 clade 2.3.2 (Sukoharjo) vaccine against AI H5N1 clade 2.3.2 (Sukoharjo) virus in SPF chickens was tested in the laboratory. Four groups of SPF chickens were used in this study: (1) vaccinated with H5N1 clade 2.1.3 (product A), (2) vaccinated with H5N1 clade 2.1.3 (product B), (3) vaccinated with AI H5N1 clade 2.3.2, and (4) unvaccinated (control). Each vaccinated group consisted of 10 chickens, and the control group of 8 chickens. SPF chickens were vaccinated with one dose of vaccine at 3 weeks of age; 3 weeks post-vaccination (at 6 weeks of age), all groups were challenged intranasally with 10^6 EID50 per 0.1 ml. Chickens vaccinated with H5N1 clade 2.1.3 products A and B showed 100% and 80% protection, respectively, but shed the challenge virus, whereas the H5N1 clade 2.3.2 vaccine gave 100% protection from mortality without virus shedding. The AI H5N1 clade 2.1.3 product A vaccine performed better than product B, and the H5N1 clade 2.3.2 vaccine was the best choice for vaccinating chickens against H5N1 clade 2.3.2. To protect chickens from AI subtype H5N1 clades 2.1.3 and 2.3.2 in the field, a bivalent vaccine of H5N1 clade 2.1.3 and 2.3.2 subtypes should be developed.

  18. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  19. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    With the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field encompassing the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various machine learning algorithms and their applications in bioinformatics.

  20. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets.

    Science.gov (United States)

    Rideout, Jai Ram; Chase, John H; Bolyen, Evan; Ackermann, Gail; González, Antonio; Knight, Rob; Caporaso, J Gregory

    2016-06-13

    Bioinformatics software often requires human-generated tabular text files as input and has specific requirements for how those data are formatted. Users frequently manage these data in spreadsheet programs, which is convenient for researchers who are compiling the requisite information because the spreadsheet programs can easily be used on different platforms including laptops and tablets, and because they provide a familiar interface. It is increasingly common for many different researchers to be involved in compiling these data, including study coordinators, clinicians, lab technicians and bioinformaticians. As a result, many research groups are shifting toward using cloud-based spreadsheet programs, such as Google Sheets, which support the concurrent editing of a single spreadsheet by different users working on different platforms. Most of the researchers who enter data are not familiar with the formatting requirements of the bioinformatics programs that will be used, so validating and correcting file formats is often a bottleneck prior to beginning bioinformatics analysis. We present Keemei, a Google Sheets Add-on, for validating tabular files used in bioinformatics analyses. Keemei is available free of charge from Google's Chrome Web Store. Keemei can be installed and run on any web browser supported by Google Sheets. Keemei currently supports the validation of two widely used tabular bioinformatics formats, the Quantitative Insights into Microbial Ecology (QIIME) sample metadata mapping file format and the Spatially Referenced Genetic Data (SRGD) format, but is designed to easily support the addition of others. Keemei will save researchers time and frustration by providing a convenient interface for tabular bioinformatics file format validation. By allowing everyone involved with data entry for a project to easily validate their data, it will reduce the validation and formatting bottlenecks that are commonly encountered when human-generated data files are

  1. The Longibrachiatum Clade of Trichoderma: a revision with new species

    Science.gov (United States)

    The Longibrachiatum Clade of Trichoderma is revised. Eight new species are described (T. aethiopicum, T. capillare, T. flagellatum, T. gillesii, T. gracile, T. pinnatum, T. saturnisporopsis, T. solani). The twenty-one species known to belong to the Longibrachiatum Clade are included in a synoptic ke...

  2. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio Hernández

    2015-06-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations (it requires the existence of multialigned family protein sequences) but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.

  3. Host-ant specificity of endangered large blue butterflies (Phengaris spp., Lepidoptera: Lycaenidae) in Japan.

    Science.gov (United States)

    Ueda, Shouhei; Komatsu, Takashi; Itino, Takao; Arai, Ryusuke; Sakamoto, Hironori

    2016-11-03

    Large blue butterflies, Phengaris (Maculinea), are an important focus of endangered-species conservation in Eurasia. Later-instar Phengaris caterpillars live in Myrmica ant nests and exploit the ant colony's resources, and they are specialized to specific host-ant species. For example, local extinction of P. arion in the U. K. is thought to have been due to the replacement of its host-ant species with a less-suitable congener, as a result of changes in habitat. In Japan, Myrmica kotokui hosts P. teleius and P. arionides caterpillars. We recently showed, however, that the morphological species M. kotokui actually comprises four genetic clades. Therefore, to determine to which group of ants the hosts of these two Japanese Phengaris species belong, we used mitochondrial COI-barcoding of M. kotokui specimens from colonies in the habitats of P. teleius and P. arionides to identify the ant clade actually parasitized by the caterpillars of each species. We found that these two butterfly species parasitize different ant clades within M. kotokui.

  4. Rapid Identification of Different Escherichia coli Sequence Type 131 Clades.

    Science.gov (United States)

    Matsumura, Yasufumi; Pitout, Johann D D; Peirano, Gisele; DeVinney, Rebekah; Noguchi, Taro; Yamamoto, Masaki; Gomi, Ryota; Matsuda, Tomonari; Nakano, Satoshi; Nagao, Miki; Tanaka, Michio; Ichiyama, Satoshi

    2017-08-01

    Escherichia coli sequence type 131 (ST131) is a pandemic clonal lineage that is responsible for the global increase in fluoroquinolone resistance and extended-spectrum-β-lactamase (ESBL) producers. The members of ST131 clade C, especially subclades C2 and C1-M27, are associated with ESBLs. We developed a multiplex conventional PCR assay with the ability to detect all ST131 clades (A, B, and C), as well as C subclades (C1-M27, C1-nM27 [C1-non-M27], and C2). To validate the assay, we used 80 ST131 global isolates that had been fully sequenced. We then used the assay to define the prevalence of each clade in two Japanese collections consisting of 460 ESBL-producing E. coli ST131 (2001-12) and 329 E. coli isolates from extraintestinal sites (ExPEC) (2014). The assay correctly identified the different clades in all 80 global isolates: clades A (n = 12), B (n = 12), and C, including subclades C1-M27 (n = 16), C1-nM27 (n = 20), C2 (n = 17), and other C (n = 3). The assay also detected all 565 ST131 isolates in both collections without any false positives. Isolates from clades A (n = 54), B (n = 23), and C (n = 483) corresponded to the O serotypes and the fimH types of O16-H41, O25b-H22, and O25b-H30, respectively. Of the 483 clade C isolates, C1-M27 was the most common subclade (36%), followed by C1-nM27 (32%) and C2 (15%). The C1-M27 subclade with blaCTX-M-27 became especially prominent after 2009. Our novel multiplex PCR assay revealed the predominance of the C1-M27 subclade in recent Japanese ESBL-producing E. coli isolates and is a promising tool for epidemiological studies of ST131. Copyright © 2017 American Society for Microbiology.

  5. Why should we investigate the morphological disparity of plant clades?

    Science.gov (United States)

    Oyston, Jack W; Hughes, Martin; Gerber, Sylvain; Wills, Matthew A

    2016-04-01

    Disparity refers to the morphological variation in a sample of taxa, and is distinct from diversity or taxonomic richness. Diversity and disparity are fundamentally decoupled; many groups attain high levels of disparity early in their evolution, while diversity is still comparatively low. Diversity may subsequently increase even in the face of static or declining disparity by increasingly fine sub-division of morphological 'design' space (morphospace). Many animal clades reached high levels of disparity early in their evolution, but there have been few comparable studies of plant clades, despite their profound ecological and evolutionary importance. This study offers a prospective and some preliminary macroevolutionary analyses. Classical morphometric methods are most suitable when there is reasonable conservation of form, but lose traction where morphological differences become greater (e.g. in comparisons across higher taxa). Discrete character matrices offer one means to compare a greater diversity of forms. This study explores morphospaces derived from eight discrete data sets for major plant clades, and discusses their macroevolutionary implications. Most of the plant clades in this study show initial, high levels of disparity that approach or attain the maximum levels reached subsequently. These plant clades are characterized by an initial phase of evolution during which most regions of their empirical morphospaces are colonized. Angiosperms, palms, pines and ferns show remarkably little variation in disparity through time. Conifers furnish the most marked exception, appearing at relatively low disparity in the latest Carboniferous, before expanding incrementally with the radiation of successive, tightly clustered constituent sub-clades. Many cladistic data sets can be repurposed for investigating the morphological disparity of plant clades through time, and offer insights that are complementary to more focused morphometric studies. The unique structural and

  6. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  7. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  8. Phylogenomics and comparative genomic studies delineate six main clades within the family Enterobacteriaceae and support the reclassification of several polyphyletic members of the family.

    Science.gov (United States)

    Alnajar, Seema; Gupta, Radhey S

    2017-10-01

    The family Enterobacteriaceae harbors many important pathogens; however, it has proven difficult to reliably distinguish different members of this family or discern their interrelationships. To understand the interrelationships among the Enterobacteriaceae species, we have constructed two comprehensive phylogenetic trees for 78 genome-sequenced Enterobacteriaceae species based on 2487 core genome proteins, and another set of 118 conserved proteins. The genome sequences of Enterobacteriaceae species were also analyzed for genetic relatedness based on average amino acid identity and 16S rRNA sequence similarity. In parallel, comparative genomic studies on protein sequences from the Enterobacteriaceae have identified 88 molecular markers in the form of conserved signature indels (CSIs) that are uniquely shared by specific members of the family. All of these multiple lines of investigation provide consistent evidence that most of the species/genera within the family can be assigned to 6 different subfamily level clades which are designated as the "Escherichia clade", "Klebsiella clade", "Enterobacter clade", "Kosakonia clade", "Cronobacter clade" and "Cedecea clade". The members of the six described clades, in addition to their distinct branching in phylogenetic trees, can now be reliably demarcated in molecular terms on the basis of multiple identified CSIs that are exclusively shared by the group members. Several additional CSIs identified in this work that are either specific for individual genera (viz. Kosakonia, Kluyvera and Escherichia-Shigella), or are present at various taxonomic depths, offer information regarding the interrelationships among the different clades. The described molecular markers provide novel means for diagnostic as well as genetic and biochemical studies on the Enterobacteriaceae species and for resolving the polyphyly of its several genera viz. Escherichia, Enterobacter and Kluyvera. On the basis of our results, we are proposing the

  9. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  10. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations, one in medicine (drug addictions) and one in bioinformatics (comparison of genomes).

  11. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols.

    Science.gov (United States)

    Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A

    2015-01-01

    Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the Python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy-to-learn wiki syntax effectively lowers the barriers to bringing expert programmers and less computer-savvy researchers onto the same page. PyPedia demonstrates how a wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complements existing resources, useful for local and multi-center research teams. PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia Python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.

  12. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  13. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more recently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  14. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  15. OralCard: a bioinformatic tool for the study of oral proteome.

    Science.gov (United States)

    Arrais, Joel P; Rosa, Nuno; Melo, José; Coelho, Edgar D; Amaral, Diana; Correia, Maria José; Barros, Marlene; Oliveira, José Luís

    2013-07-01

    The molecular complexity of the human oral cavity can only be clarified through identification of components that participate within it. However, current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome both of human (OralOme) and microbial origin (MicroOralOme) structured in protein, disease and organism. This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with diabetes mellitus type 2. OralCard is available at http://bioinformatics.ua.pt/oralcard. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. The development and application of bioinformatics core competencies to improve bioinformatics training and education.

    Science.gov (United States)

    Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie

    2018-02-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.

  17. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  18. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  19. Bioinformatics Identification of Antigenic Peptide: Predicting the Specificity of Major MHC Class I and II Pathway Players

    DEFF Research Database (Denmark)

    Lund, Ole; Karosiene, Edita; Lundegaard, Claus

    2013-01-01

    Bioinformatics methods for immunology have become increasingly used over the last decade and now form an integrated part of most epitope discovery projects. This wide usage has led to the confusion of defining which of the many methods to use for what problems. In this chapter, an overview is given...

  20. Molecular characteristic and pathogenicity of Indonesian H5N1 clade 2.3.2 viruses

    Directory of Open Access Journals (Sweden)

    Dharmayanti NLPI

    2013-06-01

    The outbreak of disease in late 2012 in Indonesia caused high duck mortality. The agent of the disease was identified as H5N1 clade 2.3.2. The disease caused economic loss to Indonesian duck farmers. Clade 2.3.2 of the H5N1 virus had not previously been identified, so this study was conducted to characterize four H5N1 clade 2.3.2 viruses by DNA sequencing of eight viral gene segments, namely HA, NA, NS, M, PB1, PB2, PA and NP. The pathogenicity of clade 2.3.2 viruses in ducks was compared with that of clade 2.1.3 viruses, which predominantly circulate in Indonesia. Phylogenetic tree analysis showed that the four clade 2.3.2 viruses isolated in 2012 were newly introduced from abroad. Further analysis showed that all eight genes grouped with clade 2.3.2 viruses, especially those from Vietnam, and did not belong to the Indonesian virus group. The pathogenicity test in ducks showed that H5N1 clade 2.3.2 and clade 2.1.3 viruses produce similar clinical symptoms and pathogenicity and cause death in 75% of ducks on days 3-6 after infection.

  1. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. A bioinformatics approach to the development of immunoassays for specified risk material in canned meat products

    NARCIS (Netherlands)

    Reece, P.; Bremer, M.G.E.G.; Stones, R.; Danks, C.; Baumgartner, S.; Tomkies, V.; Hemetsberger, C.; Smits, N.G.E.; Lubbe, W.

    2009-01-01

    A bioinformatics approach to developing antibodies to specific proteins has been evaluated for the production of antibodies to heat-processed specified risk tissues from ruminants (brain and eye tissue). The approach involved the identification of proteins specific to ruminant tissues by

  3. Herbivory increases diversification across insect clades.

    Science.gov (United States)

    Wiens, John J; Lapoint, Richard T; Whiteman, Noah K

    2015-09-24

    Insects contain more than half of all living species, but the causes of their remarkable diversity remain poorly understood. Many authors have suggested that herbivory has accelerated diversification in many insect clades. However, others have questioned the role of herbivory in insect diversification. Here, we test the relationships between herbivory and insect diversification across multiple scales. We find a strong, positive relationship between herbivory and diversification among insect orders. However, herbivory explains less variation in diversification within some orders (Diptera, Hemiptera) or shows no significant relationship with diversification in others (Coleoptera, Hymenoptera, Orthoptera). Thus, we support the overall importance of herbivory for insect diversification, but also show that its impacts can vary across scales and clades. In summary, our results illuminate the causes of species richness patterns in a group containing most living species, and show the importance of ecological impacts on diversification in explaining the diversity of life.

  4. The Virtual Xenbase: transitioning an online bioinformatics resource to a private cloud.

    Science.gov (United States)

    Karimi, Kamran; Vize, Peter D

    2014-01-01

    As a model organism database, Xenbase has been providing informatics and genomic data on Xenopus (Silurana) tropicalis and Xenopus laevis frogs for more than a decade. The Xenbase database contains curated, as well as community-contributed and automatically harvested literature, gene and genomic data. A GBrowse genome browser, a BLAST+ server and stock center support are available on the site. When this resource was first built, all software services and components in Xenbase ran on a single physical server, with inherent reliability, scalability and inter-dependence issues. Recent advances in networking and virtualization techniques allowed us to move Xenbase to a virtual environment, and more specifically to a private cloud. To do so we decoupled the different software services and components, such that each would run on a different virtual machine. In the process, we also upgraded many of the components. The resulting system is faster and more reliable. System maintenance is easier, as individual virtual machines can now be updated, backed up and changed independently. We are also experiencing more effective resource allocation and utilization. Database URL: www.xenbase.org. © The Author(s) 2014. Published by Oxford University Press.

  5. [Pharmacogenetics II. Research molecular methods, bioinformatics and ethical concerns].

    Science.gov (United States)

    Daudén, E

    2007-01-01

    Pharmacogenetics refers to the study of the individual pharmacological response based on the genotype. Its objective is to optimize treatment on an individual basis, thereby creating a more efficient and safe personalized therapy. In the second part of this review, the molecular methods of study in pharmacogenetics, including microarray technology or DNA chips, are discussed. Among them we highlight the microarrays used to determine gene expression, which detect specific RNA sequences, and the microarrays employed to determine the genotype, which detect specific DNA sequences, including polymorphisms, particularly single nucleotide polymorphisms (SNPs). The relationship between pharmacogenetics, bioinformatics and ethical concerns is reviewed.

  6. Open source approaches to establishing Roseobacter clade bacteria as synthetic biology chassis for biogeoengineering

    Directory of Open Access Journals (Sweden)

    Yanika Borg

    2016-07-01

    Aim. The nascent field of bio-geoengineering stands to benefit from synthetic biologists’ efforts to standardise, and in so doing democratise, biomolecular research methods. Roseobacter clade bacteria comprise 15–20% of oceanic bacterioplankton communities, making them a prime candidate for establishment of synthetic biology chassis for bio-geoengineering activities such as bioremediation of oceanic waste plastic. Developments such as the increasing affordability of DNA synthesis and laboratory automation continue to foster the establishment of a global ‘do-it-yourself’ research community alongside the more traditional arenas of academe and industry. As a collaborative group of citizen, student and professional scientists we sought to test the following hypotheses: (i) that an incubator capable of cultivating bacterial cells can be constructed entirely from non-laboratory items, (ii) that marine bacteria from the Roseobacter clade can be established as a genetically tractable synthetic biology chassis using plasmids conforming to the BioBrick™ standard and finally, (iii) that identifying and subcloning genes from a Roseobacter clade species can readily be achieved by citizen scientists using open source cloning and bioinformatic tools. Method. We cultivated three Roseobacter species, Roseobacter denitrificans, Oceanobulbus indolifex and Dinoroseobacter shibae. For each species we measured chloramphenicol sensitivity, viability over 11 weeks of glycerol-based cryopreservation and tested the effectiveness of a series of electroporation and heat shock protocols for transformation using a variety of plasmid types. We also attempted construction of an incubator-shaker device using only publicly available components. Finally, a subgroup comprising citizen scientists designed and attempted a procedure for isolating the cold resistance anf1 gene from Oceanobulbus indolifex cells and subcloning it into a BioBrick™ formatted plasmid. Results. All

  7. Photosynthetic pigments of oceanic Chlorophyta belonging to prasinophytes clade VII.

    Science.gov (United States)

    Lopes Dos Santos, Adriana; Gourvil, Priscillia; Rodríguez, Francisco; Garrido, José Luis; Vaulot, Daniel

    2016-02-01

    The ecological importance and diversity of pico/nanoplanktonic algae remains poorly studied in marine waters, in part because many are tiny and without distinctive morphological features. Amongst green algae, Mamiellophyceae such as Micromonas or Bathycoccus are dominant in coastal waters while prasinophytes clade VII, yet not formerly described, appear to be major players in open oceanic waters. The pigment composition of 14 strains representative of different subclades of clade VII was analyzed using a method that improves the separation of loroxanthin and neoxanthin. All the prasinophytes clade VII analyzed here showed a pigment composition similar to that previously reported for RCC287 corresponding to pigment group prasino-2A. However, we detected in addition astaxanthin for which it is the first report in prasinophytes. Among the strains analyzed, the pigment signature is qualitatively similar within subclades A and B. By contrast, RCC3402 from subclade C (Picocystis) lacks loroxanthin, astaxanthin, and antheraxanthin but contains alloxanthin, diatoxanthin, and monadoxanthin that are usually found in diatoms or cryptophytes. For subclades A and B, loroxanthin was lowest at highest light irradiance suggesting a light-harvesting role of this pigment in clade VII as in Tetraselmis. © 2015 Phycological Society of America.

  8. Extending Asia Pacific bioinformatics into new realms in the "-omics" era.

    Science.gov (United States)

    Ranganathan, Shoba; Eisenhaber, Frank; Tong, Joo Chuan; Tan, Tin Wee

    2009-12-03

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation dating back to 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 7-11, 2009 at Biopolis, Singapore. Besides bringing together scientists from the field of bioinformatics in this region, InCoB has actively engaged clinicians and researchers from the area of systems biology, to facilitate greater synergy between these two groups. InCoB2009 followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India), Hong Kong and Taipei (Taiwan), with InCoB2010 scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. The Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and symposia on Clinical Bioinformatics (CBAS), the Singapore Symposium on Computational Biology (SYMBIO) and training tutorials were scheduled prior to the scientific meeting, and provided ample opportunity for in-depth learning and special interest meetings for educators, clinicians and students. We provide a brief overview of the peer-reviewed bioinformatics manuscripts accepted for publication in this supplement, grouped into thematic areas. In order to facilitate scientific reproducibility and accountability, we have, for the first time, introduced minimum information criteria for our publications, including compliance with a Minimum Information about a Bioinformatics Investigation (MIABi). As the regional research expertise in bioinformatics matures, we have delineated a minimum set of bioinformatics skills required for addressing the computational challenges of the "-omics" era.

  9. Tissue Banking, Bioinformatics, and Electronic Medical Records: The Front-End Requirements for Personalized Medicine

    Science.gov (United States)

    Suh, K. Stephen; Sarojini, Sreeja; Youssif, Maher; Nalley, Kip; Milinovikj, Natasha; Elloumi, Fathi; Russell, Steven; Pecora, Andrew; Schecter, Elyssa; Goy, Andre

    2013-01-01

    Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs). The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publicly available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMRs in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine. PMID:23818899

  10. Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools

    Science.gov (United States)

    Diaz Acosta, B.

    2011-01-01

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative is comprised of two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework—initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.

  11. Designing a course model for distance-based online bioinformatics training in Africa: The H3ABioNet experience

    Science.gov (United States)

    Panji, Sumir; Fernandes, Pedro L.; Judge, David P.; Ghouila, Amel; Salifu, Samson P.; Ahmed, Rehab; Kayondo, Jonathan; Ssemwanga, Deogratius

    2017-01-01

    Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge “Introduction to Bioinformatics” course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery–mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online “question and discussion” forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings. PMID:28981516

  12. BioXSD: the common data-exchange format for everyday bioinformatics web services

    DEFF Research Database (Denmark)

    Kalas, M.; Puntervoll, P.; Joseph, A.

    2010-01-01

    Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use...... and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth...... interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web....

  13. Application of bioinformatics on the detection of pathogens by PCR

    International Nuclear Information System (INIS)

    Rezig, Slim; Sakhri, Saber

    2007-01-01

    Salmonellae are the main agents responsible for frequent food-borne gastrointestinal diseases. Their detection by classical methods is laborious, and the results take a long time to obtain. In this context, we set out to establish a technique for detecting the invA virulence gene, which is found in the majority of Salmonella species. After PCR amplification using specific primers designed and verified with bioinformatics programs, two primer pairs were established, and they appeared to be very specific and sensitive for detection of the invA gene. (Author)

  14. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins. Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  15. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  16. Eukaryotic Pathogen Database Resources (EuPathDB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — EuPathDB Bioinformatics Resource Center for Biodefense and Emerging/Re-emerging Infectious Diseases is a portal for accessing genomic-scale datasets associated with...

  17. Developing library bioinformatics services in context: the Purdue University Libraries bioinformationist program.

    Science.gov (United States)

    Rein, Diane C

    2006-07-01

    Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making.

  18. Phylogeny, evolutionary trends and classification of the Spathelia-Ptaeroxylon clade: morphological and molecular insights.

    Science.gov (United States)

    Appelhans, M S; Smets, E; Razafimandimbison, S G; Haevermans, T; van Marle, E J; Couloux, A; Rabarison, H; Randrianarivelojosia, M; Kessler, P J A

    2011-06-01

    The Spathelia-Ptaeroxylon clade is a group of morphologically diverse plants that have been classified together as a result of molecular phylogenetic studies. The clade is currently included in Rutaceae and recognized at a subfamilial level (Spathelioideae) despite the fact that most of its genera have traditionally been associated with other families and that there are no obvious morphological synapomorphies for the clade. The aim of the present study is to construct phylogenetic trees for the Spathelia-Ptaeroxylon clade and to investigate anatomical characters in order to decide whether it should be kept in Rutaceae or recognized at the familial level. Anatomical characters were plotted on a cladogram to help explain character evolution within the group. Moreover, phylogenetic relationships and generic limits within the clade are also addressed. A species-level phylogenetic analysis of the Spathelia-Ptaeroxylon clade based on five plastid DNA regions (rbcL, atpB, trnL-trnF, rps16 and psbA-trnH) was conducted using Bayesian, maximum parsimony and maximum likelihood methods. Leaf and seed anatomical characters of all genera were (re)investigated by light and scanning electron microscopy. With the exception of Spathelia, all genera of the Spathelia-Ptaeroxylon clade are monophyletic. The typical leaf and seed anatomical characters of Rutaceae were found. Further, the presence of oil cells in the leaves provides a possible synapomorphy for the clade. The Spathelia-Ptaeroxylon clade is well placed in Rutaceae and it is reasonable to unite the genera into one subfamily (Spathelioideae). We propose a new tribal classification of Spathelioideae. A narrow circumscription of Spathelia is established to make the genus monophyletic, and Sohnreyia is resurrected to accommodate the South American species of Spathelia. The most recent common ancestor of Spathelioideae probably had leaves with secretory cavities and oil cells, haplostemonous flowers with appendaged staminal

  19. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  20. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is a lack of bioinformatics training in the universities, except for short training courses in which ...

  1. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Shan Li

    2014-01-01

    Full Text Available With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithms, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  2. Genome-wide comparison of cowpox viruses reveals a new clade related to Variola virus.

    Directory of Open Access Journals (Sweden)

    Piotr Wojtek Dabrowski

    Full Text Available Zoonotic infections caused by several orthopoxviruses (OPV) like monkeypox virus or vaccinia virus have a significant impact on human health. In Europe, the number of diagnosed infections with cowpox viruses (CPXV) is increasing in animals as well as in humans. CPXV used to be enzootic in cattle; however, such infections were not being diagnosed over the last decades. Instead, individual cases of cowpox are being found in cats or exotic zoo animals that transmit the infection to humans. Both animals and humans reveal local exanthema on arms and legs or on the face. Although cowpox is generally regarded as a self-limiting disease, immunosuppressed patients can develop a lethal systemic disease resembling smallpox. To date, only limited information on the complex and, compared to other OPV, sparsely conserved CPXV genomes is available. Since CPXV displays the widest host range of all OPV known, it seems important to comprehend the genetic repertoire of CPXV which in turn may help elucidate specific mechanisms of CPXV pathogenesis and origin. Therefore, 22 genomes of independent CPXV strains from clinical cases, involving ten humans, four rats, two cats, two jaguarundis, one beaver, one elephant, one marah and one mongoose, were sequenced by using massive parallel pyrosequencing. The extensive phylogenetic analysis showed that the CPXV strains sequenced clearly cluster into several distinct clades, some of which are closely related to Vaccinia viruses while others represent different clades in a CPXV cluster. Particularly one CPXV clade is more closely related to Camelpox virus, Taterapox virus and Variola virus than to any other known OPV. These results support and extend recent data from other groups who postulate that CPXV does not form a monophyletic clade and should be divided into multiple lineages.

  3. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
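
    The kind of cross-database question described above (which EC-numbered enzyme activities have no sequence in any loaded sequence database) can be sketched as a single SQL query. The following is a minimal illustration only: the two-table schema, the table and column names, and the toy rows are all hypothetical and are not BioWarehouse's actual schema or data. It uses Python's built-in sqlite3 module in place of MySQL or Oracle.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (
            accession TEXT PRIMARY KEY,
            ec_number TEXT
        );
        """
    )
    # Toy rows for illustration only; not real warehouse content.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("1.1.1.1",), ("2.7.1.1",), ("3.4.21.4",), ("4.2.1.1",)],
    )
    conn.executemany(
        "INSERT INTO protein_sequence VALUES (?, ?)",
        [("P1", "1.1.1.1"), ("P2", "1.1.1.1"), ("P3", "2.7.1.1"), ("P4", "4.2.1.1")],
    )
    return conn


def fraction_without_sequence(conn):
    """Return the fraction of EC-numbered activities with no linked sequence."""
    total, orphans = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    # An empty enzyme_activity table yields total == 0; report 0.0 in that case.
    return orphans / total if total else 0.0


if __name__ == "__main__":
    print(fraction_without_sequence(build_toy_warehouse()))
```

    In the toy data one of four activities lacks a sequence, so the function returns 0.25; the 36% figure in the abstract comes from the authors' own warehouse, not from this sketch.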

  4. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology.

    Science.gov (United States)

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-11-01

    Visceral leishmaniasis, or kala-azar, is a potent parasitic infection that causes the death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction must also be considered in patients infected with Leishmania donovani (L. donovani). To eradicate the disease completely, modern research is currently being applied at both the molecular level and the field level. Computational approaches such as remote sensing, geographical information systems (GIS) and bioinformatics are key resources for the detection and distribution of vectors, patterns, ecological and environmental factors, and for genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another.

  5. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  6. Comparing the Effects of Symbiotic Algae (Symbiodinium) Clades C1 and D on Early Growth Stages of Acropora tenuis

    Science.gov (United States)

    Yuyama, Ikuko; Higuchi, Tomihiko

    2014-01-01

    Reef-building corals switch endosymbiotic algae of the genus Symbiodinium during their early growth stages and during bleaching events. Clade C Symbiodinium algae are dominant in corals, although other clades, including A and D, have also been commonly detected in juvenile Acroporid corals. Previous studies have reported only molecular data identifying the Symbiodinium clades present within field corals. In this study, we inoculated aposymbiotic juvenile polyps with cultures of clades C1 and D Symbiodinium algae, and investigated the different effects of these two clades of Symbiodinium on juvenile polyps. Our results showed that clade C1 algae did not grow, while clade D algae grew rapidly during the first 2 months after inoculation. Polyps associated with clade C1 algae exhibited bright green fluorescence across the body and tentacles after inoculation. The growth rate of polyp skeletons was lower in polyps associated with clade C1 algae than in those associated with clade D algae. On the other hand, antioxidant activity (catalase) of corals was not significantly different between corals with clade C1 and clade D algae. Our results suggested that clade D Symbiodinium algae easily form symbiotic relationships with corals and that these algae could contribute to coral growth in early symbiosis stages. PMID:24914677

  7. Phylogeny, evolutionary trends and classification of the Spathelia–Ptaeroxylon clade: morphological and molecular insights

    Science.gov (United States)

    Appelhans, M. S.; Smets, E.; Razafimandimbison, S. G.; Haevermans, T.; van Marle, E. J.; Couloux, A.; Rabarison, H.; Randrianarivelojosia, M.; Keßler, P. J. A.

    2011-01-01

    Background and Aims: The Spathelia–Ptaeroxylon clade is a group of morphologically diverse plants that have been classified together as a result of molecular phylogenetic studies. The clade is currently included in Rutaceae and recognized at a subfamilial level (Spathelioideae) despite the fact that most of its genera have traditionally been associated with other families and that there are no obvious morphological synapomorphies for the clade. The aim of the present study is to construct phylogenetic trees for the Spathelia–Ptaeroxylon clade and to investigate anatomical characters in order to decide whether it should be kept in Rutaceae or recognized at the familial level. Anatomical characters were plotted on a cladogram to help explain character evolution within the group. Moreover, phylogenetic relationships and generic limits within the clade are also addressed. Methods: A species-level phylogenetic analysis of the Spathelia–Ptaeroxylon clade based on five plastid DNA regions (rbcL, atpB, trnL–trnF, rps16 and psbA–trnH) was conducted using Bayesian, maximum parsimony and maximum likelihood methods. Leaf and seed anatomical characters of all genera were (re)investigated by light and scanning electron microscopy. Key Results: With the exception of Spathelia, all genera of the Spathelia–Ptaeroxylon clade are monophyletic. The typical leaf and seed anatomical characters of Rutaceae were found. Further, the presence of oil cells in the leaves provides a possible synapomorphy for the clade. Conclusions: The Spathelia–Ptaeroxylon clade is well placed in Rutaceae and it is reasonable to unite the genera into one subfamily (Spathelioideae). We propose a new tribal classification of Spathelioideae. A narrow circumscription of Spathelia is established to make the genus monophyletic, and Sohnreyia is resurrected to accommodate the South American species of Spathelia. The most recent common ancestor of Spathelioideae probably had leaves with secretory cavities

  8. Diversification of AID/APOBEC-like deaminases in metazoa: multiplicity of clades and widespread roles in immunity.

    Science.gov (United States)

    Krishnan, Arunkumar; Iyer, Lakshminarayan M; Holland, Stephen J; Boehm, Thomas; Aravind, L

    2018-04-03

    AID/APOBEC deaminases (AADs) convert cytidine to uridine in single-stranded nucleic acids. They are involved in numerous mutagenic processes, including those underpinning vertebrate innate and adaptive immunity. Using a multipronged sequence analysis strategy, we uncover several AADs across metazoa, dictyosteliida, and algae, including multiple previously unreported vertebrate clades, and versions from urochordates, nematodes, echinoderms, arthropods, lophotrochozoans, cnidarians, and porifera. Evolutionary analysis suggests a fundamental division of AADs early in metazoan evolution into secreted deaminases (SNADs) and classical AADs, followed by diversification into several clades driven by rapid-sequence evolution, gene loss, lineage-specific expansions, and lateral transfer to various algae. Most vertebrate AADs, including AID and APOBECs1-3, diversified in the vertebrates, whereas the APOBEC4-like clade has a deeper origin in metazoa. Positional entropy analysis suggests that several AAD clades are diversifying rapidly, especially in the positions predicted to interact with the nucleic acid target motif, and with potential viral inhibitors. Further, several AADs have evolved neomorphic metal-binding inserts, especially within loops predicted to interact with the target nucleic acid. We also observe polymorphisms, driven by alternative splicing, gene loss, and possibly intergenic recombination between paralogs. We propose that biological conflicts of AADs with viruses and genomic retroelements are drivers of rapid AAD evolution, suggesting a widespread presence of mutagenesis-based immune-defense systems. Deaminases like AID represent versions "institutionalized" from the broader array of AADs pitted in such arms races for mutagenesis of self-DNA, and similar recruitment might have independently occurred elsewhere in metazoa. Copyright © 2018 the Author(s). Published by PNAS.
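
    As an illustration of what a positional entropy analysis measures, the sketch below computes the Shannon entropy of each column of a multiple sequence alignment. This is an assumption about the general technique: the abstract does not state the exact formula, alphabet, or gap handling the authors used. Here gaps are simply ignored, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1/p); a fully conserved column gives 0.
    return sum((k / n) * math.log2(n / k) for k in counts.values())


def positional_entropy(alignment):
    """Return the per-column entropy for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


if __name__ == "__main__":
    # Toy alignment: two conserved columns followed by one fully variable column.
    toy = ["ACD", "ACE", "ACF", "ACG"]
    print(positional_entropy(toy))
```

    Columns with high entropy are the rapidly diversifying positions; in the toy alignment only the last column varies, so it alone has non-zero entropy.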

  9. Achievements and challenges in structural bioinformatics and computational biophysics.

    Science.gov (United States)

    Samish, Ilan; Bourne, Philip E; Najmanovich, Rafael J

    2015-01-01

    The field of structural bioinformatics and computational biophysics has undergone a revolution in the last 10 years. These developments are captured annually through the 3DSIG meeting, upon which this article reflects. An increase in the accessible data, computational resources and methodology has resulted in an increase in the size and resolution of studied systems and the complexity of the questions amenable to research. Concomitantly, the parameterization and efficiency of the methods have markedly improved along with their cross-validation with other computational and experimental results. The field exhibits an ever-increasing integration with biochemistry, biophysics and other disciplines. In this article, we discuss recent achievements along with current challenges within the field. © The Author 2014. Published by Oxford University Press.

  10. Establishment of the cross-clade antigen detection system for H5 subtype influenza viruses using peptide monoclonal antibodies specific for influenza virus H5 hemagglutinin.

    Science.gov (United States)

    Takahashi, Hitoshi; Nagata, Shiho; Odagiri, Takato; Kageyama, Tsutomu

    2018-04-15

    The H5 subtype of highly pathogenic avian influenza (H5 HPAI) viruses is a threat to both animal and human public health and has the potential to cause a serious future pandemic in humans. Thus, specific and rapid detection of H5 HPAI viruses is required for infection control in humans. To develop a simple and rapid diagnostic system to detect H5 HPAI viruses with high specificity and sensitivity, we attempted to prepare monoclonal antibodies (mAbs) that specifically recognize linear epitopes in hemagglutinin (HA) of H5 subtype viruses. Nine mAb clones were obtained from mice immunized with a synthetic partial peptide of H5 HA molecules conserved among various H5 HPAI viruses. The antigen-capture enzyme-linked immunosorbent assay using the most suitable combination of these mAbs, which bound specifically to lysed H5 HA under an optimized detergent condition, was specific for H5 viruses and could broadly detect H5 viruses in multiple different clades. Taken together, these peptide mAbs, which recognize linear epitopes in a highly conserved region of H5 HA, may be useful for specific and highly sensitive detection of H5 HPAI viruses and can help in the rapid diagnosis of human, avian, and animal H5 virus infections. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Bioinformatics tools and database resources for systems genetics analysis in mice-a short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, Caroline; Swertz, Morris A.; Alberts, Rudi; Arends, Danny; Moeller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K. Joeri; Jansen, Ritsert C.; Schughart, Klaus

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  12. Phylogenetic Signal of Threatening Processes among Hylids: The Need for Clade-Level Conservation Planning

    Directory of Open Access Journals (Sweden)

    Sarah J. Corey

    2010-01-01

    Full Text Available Rapid, global declines among amphibians are partly alarming because many occur for apparently unknown or enigmatic reasons. Moreover, the relationship between phylogeny and enigmatic declines in higher clades of the amphibian phylogeny appears at first to be an intractable problem. I present a working solution by assessing threatening processes potentially underlying enigmatic declines in the family Hylidae. Applying comparative methods that account for various evolutionary scenarios, I find extreme concentrations of threatening processes, including pollution and habitat loss, in the clade Hylini, potentially influenced by traits under selection. The analysis highlights hotspots of declines under phylogenetic influence in the genera Isthmohyla, Plectrohyla and Ptychohyla, and geographically in Mexico and Guatemala. The conservation implications of concentrated phylogenetic influence across multiple threatening processes are twofold: Data Deficient species of threatened clades should be prioritized in future surveys and, perhaps, a greater vulnerability should be assigned to such clades for further consideration of clade-level conservation priorities.

  13. Molecular Signatures for the PVC Clade (Planctomycetes, Verrucomicrobia, Chlamydiae and Lentisphaerae) of Bacteria Provide Insights into their Evolutionary Relationships

    Directory of Open Access Journals (Sweden)

    Radhey S. Gupta

    2012-09-01

    Full Text Available The PVC superphylum is an amalgamation of species from the phyla Planctomycetes, Verrucomicrobia and Chlamydiae, along with the Lentisphaerae, Poribacteria and two other candidate divisions. The diverse species of this superphylum lack any significant marker that differentiates them from other bacteria. Recently, genome sequences for 37 species covering all of the main PVC groups of bacteria have become available. We have used these sequences to construct a phylogenetic tree based upon concatenated sequences for 16 proteins and identify molecular signatures in protein sequences that are specific for the species from these phyla or those providing molecular links among them. Of the useful molecular markers identified in the present work, 6 conserved signature indels (CSIs) in the proteins Cyt c oxidase, UvrD helicase, urease and a helicase-domain containing protein are specific for the species from the Verrucomicrobia phylum; three other CSIs in an ABC transporter protein, cobyrinic acid ac-diamide synthase and SpoVG protein are specific for the Planctomycetes species. Additionally, a 3 aa insert in the RpoB protein is uniquely present in all sequenced Chlamydiae, Verrucomicrobia and Lentisphaerae species, providing evidence for the shared ancestry of the species from these three phyla. Lastly, we have also identified a conserved protein of unknown function that is exclusively found in all sequenced species from the phyla Chlamydiae, Verrucomicrobia, Lentisphaerae and Planctomycetes suggesting a specific linkage among them. The absence of this protein in Poribacteria, which branches separately from other members of the PVC clade, indicates that it is not specifically related to the PVC clade of bacteria. The molecular markers described here in addition to clarifying the evolutionary relationships among the PVC clade of bacteria also provide novel tools for their identification and for genetic and biochemical studies on these organisms.

  14. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    Science.gov (United States)

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyberinfrastructure Consortium, aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was designed to also develop expertise in large scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  15. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the
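
    The kind of warehouse query described in this record can be sketched roughly as follows. This is a minimal illustration only, using Python's built-in sqlite3 module rather than MySQL or Oracle; the table names `enzyme_activity` and `protein_sequence`, their columns, and the toy rows are all hypothetical and are not taken from the BioWarehouse schema or its data.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (
            accession TEXT PRIMARY KEY,
            ec_number TEXT
        );
        """
    )
    # Toy rows for illustration only; not real warehouse content.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("1.1.1.1",), ("2.7.1.1",), ("3.5.1.5",), ("4.2.1.11",)],
    )
    conn.executemany(
        "INSERT INTO protein_sequence VALUES (?, ?)",
        [("P1", "1.1.1.1"), ("P2", "1.1.1.1"),
         ("P3", "2.7.1.1"), ("P4", "4.2.1.11")],
    )
    return conn


def fraction_without_sequence(conn):
    """Fraction of EC-numbered activities that have no linked sequence."""
    row = conn.execute(
        """
        SELECT CAST(SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END) AS REAL)
               / COUNT(*)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    print(fraction_without_sequence(build_toy_warehouse()))
```

    The LEFT JOIN keeps every enzyme activity, so activities with no matching sequence surface as NULL and can be counted; the DISTINCT subquery prevents activities with several sequences from being counted more than once.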

  16. 5th HUPO BPP Bioinformatics Meeting at the European Bioinformatics Institute in Hinxton, UK--Setting the analysis frame.

    Science.gov (United States)

    Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf

    2005-09-01

    The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.

  17. Evolutionary history of LINE-1 in the major clades of placental mammals.

    Directory of Open Access Journals (Sweden)

    Paul D Waters

    2007-01-01

    Full Text Available LINE-1 constitutes an important component of mammalian genomes. It has a dynamic evolutionary history characterized by the rise, fall and replacement of subfamilies. Most data concerning LINE-1 biology and evolution are derived from the human and mouse genomes and are often assumed to hold for all placentals. To examine LINE-1 relationships, sequences from the 3' region of the reverse transcriptase from 21 species (representing 13 orders across Afrotheria, Xenarthra, Supraprimates and Laurasiatheria) were obtained from whole genome sequence assemblies, or by PCR with degenerate primers. These sequences were aligned and analysed. Our analysis reflects accepted placental relationships suggesting mostly lineage-specific LINE-1 families. The data provide clear support for several clades including Glires, Supraprimates, Laurasiatheria, Boreoeutheria, Xenarthra and Afrotheria. Within the afrotherian LINE-1 (AfroLINE) clade, our tree supports Paenungulata, Afroinsectivora and Afroinsectiphillia. Xenarthran LINE-1 (XenaLINE) falls sister to AfroLINE, providing some support for the Atlantogenata (Xenarthra+Afrotheria) hypothesis. LINEs and SINEs make up approximately half of all placental genomes, so understanding their dynamics is an essential aspect of comparative genomics. Importantly, a tree of LINE-1 offers a different view of the root, as long edges (branches) such as that to marsupials are shortened and/or broken up. Additionally, a robust phylogeny of diverse LINE-1 is essential in testing that site-specific LINE-1 insertions, often regarded as homoplasy-free phylogenetic markers, are indeed unique and not convergent.

  18. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  19. Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration

    Directory of Open Access Journals (Sweden)

    Nelson Rex T

    2010-06-01

    Full Text Available Abstract Background: Scientific data integration and computational service discovery are challenges for the bioinformatic community. This process is made more difficult by the separate and independent construction of biological databases, which makes the exchange of data between information resources difficult and labor intensive. A recently described semantic web protocol, the Simple Semantic Web Architecture and Protocol (SSWAP; pronounced "swap"), offers the ability to describe data and services in a semantically meaningful way. We report how three major information resources (Gramene, SoyBase and the Legume Information System [LIS]) used SSWAP to semantically describe selected data and web services. Methods: We selected high-priority Quantitative Trait Locus (QTL), genomic mapping, trait, phenotypic, and sequence data and associated services such as BLAST for publication, data retrieval, and service invocation via semantic web services. Data and services were mapped to concepts and categories as implemented in legacy and de novo community ontologies. We used SSWAP to express these offerings in OWL Web Ontology Language (OWL), Resource Description Framework (RDF) and eXtensible Markup Language (XML) documents, which are appropriate for their semantic discovery and retrieval. We implemented SSWAP services to respond to web queries and return data. These services are registered with the SSWAP Discovery Server and are available for semantic discovery at http://sswap.info. Results: A total of ten services delivering QTL information from Gramene were created. From SoyBase, we created six services delivering information about soybean QTLs, and seven services delivering genetic locus information. For LIS we constructed three services, two of which allow the retrieval of DNA and RNA FASTA sequences with the third service providing nucleic acid sequence comparison capability (BLAST). Conclusions: The need for semantic integration technologies has preceded

  20. Phylogenetic and bioinformatic analysis of gap junction-related proteins, innexins, pannexins and connexins.

    Science.gov (United States)

    Fushiki, Daisuke; Hamada, Yasuo; Yoshimura, Ryoichi; Endo, Yasuhisa

    2010-04-01

    All multi-cellular animals, including hydra, insects and vertebrates, develop gap junctions, which communicate directly with neighboring cells. Gap junctions consist of protein families called connexins in vertebrates and innexins in invertebrates. Connexins and innexins have no homology in their amino acid sequence, but both are thought to have some similar characteristics, such as a tetra-membrane-spanning structure, formation of a channel by a hexamer, and transmission of small molecules (e.g. ions) to neighboring cells. Pannexins were recently identified as a homolog of innexins in vertebrate genomes. Although pannexins are thought to share the function of intercellular communication with connexins and innexins, there is little information about the relationship among these three protein families of gap junctions. We phylogenetically and bioinformatically examined these protein families and other tetra-membrane-spanning proteins using a database and three analytical software packages. The clades formed by pannexin families do not follow the species classification but instead group the paralogs of each pannexin member. Amino acid sequences of pannexins are closely related to those of innexins but less to those of connexins. These data suggest that innexins and pannexins have a common origin, but the relationship between innexins/pannexins and connexins is as slight as that of other tetra-membrane-spanning members.

  1. Recent developments in life sciences research: Role of bioinformatics

    African Journals Online (AJOL)

    Life sciences research and development has opened up new challenges and opportunities for bioinformatics. The contribution of bioinformatics advances made possible the mapping of the entire human genome and genomes of many other organisms in just over a decade. These discoveries, along with current efforts to ...

  2. Current status and future perspectives of bioinformatics in Tanzania ...

    African Journals Online (AJOL)

    The main bottleneck in advancing genomics in present times is the lack of expertise in using bioinformatics tools and approaches for data mining in raw DNA sequences generated by modern high throughput technologies such as next generation sequencing. Although bioinformatics has been making major progress and ...

  3. Genetic tools for the investigation of Roseobacter clade bacteria

    Directory of Open Access Journals (Sweden)

    Tielen Petra

    2009-12-01

    Full Text Available Abstract Background: The Roseobacter clade represents one of the most abundant, metabolically versatile and ecologically important bacterial groups found in marine habitats. A detailed molecular investigation of the regulatory and metabolic networks of these organisms is currently limited for many strains by the lack of suitable genetic tools. Results: Conjugation and electroporation methods for the efficient and stable genetic transformation of selected Roseobacter clade bacteria including Dinoroseobacter shibae, Oceanibulbus indolifex, Phaeobacter gallaeciensis, Phaeobacter inhibens, Roseobacter denitrificans and Roseobacter litoralis were tested. For this purpose an antibiotic resistance screening was performed and suitable genetic markers were selected. Based on these transformation protocols, stably maintained plasmids were identified. A plasmid-encoded oxygen-independent fluorescent system was established using the flavin mononucleotide-based fluorescent protein FbFP. Finally, a chromosomal gene knockout strategy was successfully employed for the inactivation of the anaerobic metabolism regulatory gene dnr from D. shibae DFL12T. Conclusion: A genetic toolbox for members of the Roseobacter clade was established. This provides a solid methodical basis for the detailed elucidation of gene regulatory and metabolic networks underlying the ecological success of this group of marine bacteria.

  4. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. © The Author 2014. Published by Oxford University Press.

  5. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  6. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes, the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software, the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
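
    As a small worked example of the tree enumeration mentioned in this record, the sketch below counts fully resolved (binary) tree topologies for n labelled taxa using the standard double-factorial formulas: (2n-5)!! unrooted trees for n >= 3 and (2n-3)!! rooted trees for n >= 2. The function names are illustrative choices and are not taken from the paper or from the BioQUEST tools.

```python
def double_factorial(k):
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_binary_trees(n_taxa):
    """Number of unrooted binary tree topologies on n labelled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_binary_trees(n_taxa):
    """Number of rooted binary tree topologies on n labelled taxa."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 5, 10):
        print(n, count_unrooted_binary_trees(n), count_rooted_binary_trees(n))
```

    For 10 taxa this already gives 2,027,025 unrooted and 34,459,425 rooted topologies, which is why exhaustive tree search is only feasible for very small data sets.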

  7. Neurogenomics: An opportunity to integrate neuroscience, genomics and bioinformatics research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-06-01

    Full Text Available Modern genomic approaches have made enormous contributions to improving our understanding of the function, development and evolution of the nervous system, and the diversity within and between species. However, most of these research advances have been recorded in countries with advanced scientific resources and funding support systems. On the contrary, little is known about, for example, the possible interplay between different genes, non-coding elements and environmental factors in modulating neurological diseases among populations in low-income countries, including many African countries. The unique ancestry of African populations suggests that improved inclusion of these populations in neuroscience-related genomic studies would significantly help to identify novel factors that might shape the future of neuroscience research and neurological healthcare. This perspective is strongly supported by the recent identification that diseased individuals and their kindred from specific sub-Saharan African populations lack common neurological disease-associated genetic mutations. This indicates that there may be population-specific causes of neurological diseases, necessitating further investigations into the contribution of additional, presently-unknown genomic factors. Here, we discuss how the development of neurogenomics research in Africa would help to elucidate disease-related genomic variants, and also provide a good basis to develop more effective therapies. Furthermore, neurogenomics would harness African scientists' expertise in neuroscience, genomics and bioinformatics to extend our understanding of the neural basis of behaviour, development and evolution.

  8. Bioinformatics of cardiovascular miRNA biology.

    Science.gov (United States)

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. The mystery of clade X: Orciraptor gen. nov. and Viridiraptor gen. nov. are highly specialised, algivorous amoeboflagellates (Glissomonadida, Cercozoa).

    Science.gov (United States)

    Hess, Sebastian; Melkonian, Michael

    2013-09-01

    In freshwater ecosystems a vast diversity of elusive protists exists that specifically feed on microalgae. Due to difficulties in isolation and long-term maintenance, most of these are still poorly known. In this study stable, bacteria-free cultures of several limnetic, algivorous amoeboflagellates were investigated by light microscopy and molecular phylogenetic analyses. All strains represent naked, biflagellate cells, either occurring as rigid flagellates or as surface-attached amoebae. They perforate cell walls of certain Zygnematophyceae and Chlorophyceae (Viridiplantae) and phagocytose algal cell contents. Time-lapse microscopy revealed the feeding behaviour, locomotional processes and life histories of the amoeboflagellates. Clear differences in cell morphology and food range specificity led to the description of two new, monotypic genera Orciraptor and Viridiraptor, which occupy similar, but distinct ecological niches in aquatic ecosystems as 'necrophytophagous' and 'parasitoid' protists, respectively. Molecular phylogenetic analyses based on 18S rDNA sequence data demonstrated that Orciraptor and Viridiraptor belonged to 'clade X' within the order Glissomonadida (Cercozoa, Rhizaria). In conclusion, we established the phenotypic identity of a clade, which until now was exclusively known from environmental sequences, and erect the new family Viridiraptoridae for 'clade X'. Its algivorous members are compared with other glissomonads and nomenclatural, methodological and ecological aspects of these novel 'raptorial' amoeboflagellates are discussed. Copyright © 2013 Elsevier GmbH. All rights reserved.

  10. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  11. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics

  12. Integrative Taxonomy of Amazon Reefs' Arenosclera spp.: A New Clade in the Haplosclerida (Demospongiae)

    Directory of Open Access Journals (Sweden)

    Camille V. Leal

    2017-10-01

    Full Text Available Two new Arenosclera are described here on the basis of materials obtained from Amazon reefs in 2014, A. amazonensis sp. nov. and A. klausi sp. nov. Both are clearly distinct from all other Arenosclera by their erect, solid funnel to lamellate habit, larger oxeas, and ectosomal architecture bearing occasional multispicular tracts. An integrative approach to find the best classification for both new species failed to group them and A. heroni, the genus' type species. Nearly complete 28S rRNA sequences obtained from these species' metagenomes suggested instead a better placement for the new species and A. brasiliensis in clade C (sensu Redmond et al., 2013), while A. heroni fits best in clade A. We propose to name three clades according to the rules of the PhyloCode: Arenospiculap, Dactyclonap, and Dactyspiculap, respectively for the clade originating with the most recent common ancestor of the three Brazilian Arenosclera spp.; the most inclusive clade containing Dactylia varia (Gray, 1843) and Haliclona curacaoensis (van Soest, 1980); and the least inclusive clade containing Arenospiculap and Dactyclonap. A Karlin dinucleotide dissimilarity analysis of metagenomes carried out on cryopreserved samples recognized A. amazonensis sp. nov. as the most dissimilar species, thus suggesting a more particular microbiota is present in this Amazon species, an open avenue for extended applied study of this holobiont.
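
    For readers unfamiliar with the Karlin dinucleotide dissimilarity mentioned in this record, the sketch below computes the commonly used delta* measure between two sequences: the mean absolute difference, over the 16 dinucleotides, of the strand-symmetrized relative abundances rho*_XY = f*_XY / (f*_X f*_Y). The abstract does not state which variant or software the authors applied to their metagenomes, so this is an assumption-laden illustration; setting rho* to 0 when a base is absent is a simplification chosen here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement; characters other than A, C, G, T pass through."""
    return seq.translate(_COMPLEMENT)[::-1]


def relative_abundances(seq):
    """Strand-symmetrized dinucleotide relative abundances rho*_XY."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    # Count on both strands separately so no dinucleotide spans the junction.
    for strand in (seq, reverse_complement(seq)):
        mono.update(b for b in strand if b in BASES)
        di.update(a + b for a, b in zip(strand, strand[1:])
                  if a in BASES and b in BASES)
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence has no countable dinucleotides")
    rho = {}
    for a, b in product(BASES, repeat=2):
        expected = (mono[a] / n_mono) * (mono[b] / n_mono)
        observed = di[a + b] / n_di
        # Simplification: define rho* as 0 when a base never occurs.
        rho[a + b] = observed / expected if expected > 0 else 0.0
    return rho


def karlin_delta(seq_f, seq_g):
    """delta*(f, g): mean absolute difference of rho* over 16 dinucleotides."""
    rho_f, rho_g = relative_abundances(seq_f), relative_abundances(seq_g)
    return sum(abs(rho_f[k] - rho_g[k]) for k in rho_f) / 16.0


if __name__ == "__main__":
    print(karlin_delta("AAAA", "ATAT"))
```

    Because counts are pooled over both strands, a sequence and its reverse complement have a dissimilarity of exactly zero, which is the property that makes the measure usable on unoriented metagenomic reads.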

  13. A novel strategy for efficient production of anti-V3 human scFvs against HIV-1 clade C

    Directory of Open Access Journals (Sweden)

    Kumar Rajesh

    2012-11-01

    Full Text Available Abstract Background: Production of human monoclonal antibodies that exhibit broadly neutralizing activity is needed for preventing HIV-1 infection; however, only a few such antibodies have been generated to date. Isolation of antibodies by the hybridoma technology is a cumbersome process with low yields. Further, the loss of unstable or slowly growing clones which may have unique binding specificities often occurs during cloning and propagation and the strongly positive clones are often lost. This has been avoided by the process described in this paper, wherein, by combining the strategy of EBV transformation and recombinant DNA technology, we constructed human single chain variable fragments (scFvs) against the third variable region (V3) of the clade C HIV-1 envelope. Results: An antigen specific phage library of 7000 clones was constructed from the enriched V3-positive antibody secreting EBV transformed cells. By ligation of the digested scFv DNA into phagemid vector and biopanning against the HIV-1 consensus C and B V3 peptides followed by random selection of 40 clones, we identified 15 clones that showed V3 reactivity in phage ELISA. DNA fingerprinting analysis and sequencing showed that 13 out of the 15 clones were distinct. Expression of the positive clones was tested by SDS-PAGE and Western blot. All the 13 anti-V3 scFvs showed cross-reactivity against both the clade C and B V3 peptides and did not show any reactivity against other unrelated peptides in ELISA. Preliminary neutralization assays indicated varying degrees of neutralization of clade C and B viruses. EBV transformation, followed by antigen selection of lines to identify specific binders, enabled the selection of phage from un-cloned lines for scFv generation, thus avoiding the problems of hybridoma technology. Moreover, as the clones were pretested for antigen binding, a comparatively small library sufficed for the selection of a considerable number of unique antigen binding

  14. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  15. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. 
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  16. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. 
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  17. Predicting Mycobacterium tuberculosis Complex Clades Using Knowledge-Based Bayesian Networks

    Directory of Open Access Journals (Sweden)

    Minoo Aminian

    2014-01-01

    Full Text Available We develop a novel approach for incorporating expert rules into Bayesian networks for classification of Mycobacterium tuberculosis complex (MTBC) clades. The proposed knowledge-based Bayesian network (KBBN) treats sets of expert rules as prior distributions on the classes. Unlike prior knowledge-based support vector machine approaches which require rules expressed as polyhedral sets, KBBN directly incorporates the rules without any modification. KBBN uses data to refine rule-based classifiers when the rule set is incomplete or ambiguous. We develop a predictive KBBN model for 69 MTBC clades found in the SITVIT international collection. We validate the approach using two testbeds that model knowledge of the MTBC obtained from two different experts and large DNA fingerprint databases to predict MTBC genetic clades and sublineages. These models represent strains of MTBC using high-throughput biomarkers called spacer oligonucleotide types (spoligotypes), since these are routinely gathered from MTBC isolates of tuberculosis (TB) patients. Results show that incorporating rules into problems can drastically increase classification accuracy if data alone are insufficient. The SITVIT KBBN is publicly available for use on the World Wide Web.
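
    The idea of treating expert rules as a prior over clades and letting data refine it can be illustrated with a small sketch. This is not the KBBN model from the paper: the rule format, the `strength` parameter, the clade names and the likelihood values are all hypothetical, chosen only to show how a rule-derived prior and a data-derived likelihood might combine through Bayes' rule.

```python
def rule_prior(pattern, rules, classes, strength=0.9):
    """Build a prior over classes from expert rules (hypothetical format).

    `rules` maps a class name to a predicate on the spoligotype pattern.
    Classes whose rule fires share `strength` of the probability mass;
    the remainder is spread uniformly so that no class gets a zero prior.
    """
    fired = [c for c in classes if c in rules and rules[c](pattern)]
    if not fired:
        # No rule applies: fall back to a uniform prior.
        return {c: 1.0 / len(classes) for c in classes}
    floor = (1.0 - strength) / len(classes)
    return {c: floor + (strength / len(fired) if c in fired else 0.0)
            for c in classes}


def posterior(prior, likelihood):
    """Multiply prior by per-class likelihood and normalise."""
    joint = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


# Illustrative inputs only; none of these values come from the paper.
classes = ["clade_A", "clade_B", "clade_C"]
rules = {"clade_A": lambda s: s.startswith("0")}
prior = rule_prior("0110", rules, classes)
weak = posterior(prior, {"clade_A": 0.1, "clade_B": 0.8, "clade_C": 0.1})
strong = posterior(prior, {"clade_A": 0.01, "clade_B": 0.9, "clade_C": 0.09})
```

    With mildly contrary data (`weak`) the rule-favoured clade still has the highest posterior, whereas strongly contrary data (`strong`) overrides the rule, which is the sense in which data can refine an incomplete or ambiguous rule set.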

  18. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  19. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, on June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  20. Virulence differences among Francisella tularensis subsp. tularensis clades in mice.

    Directory of Open Access Journals (Sweden)

    Claudia R Molins

    Full Text Available Francisella tularensis subspecies tularensis (type A) and holarctica (type B) are of clinical importance in causing tularemia. Molecular typing methods have further separated type A strains into three genetically distinct clades, A1a, A1b and A2. Epidemiological analyses of human infections in the United States suggest that A1b infections are associated with a significantly higher mortality rate as compared to infections caused by A1a, A2 and type B. To determine if genetic differences as defined by molecular typing directly correlate with differences in virulence, A1a, A1b, A2 and type B strains were compared in C57BL/6 mice. Here we demonstrate significant differences between survival curves for infections caused by A1b versus A1a, A2 and type B, with A1b infected mice dying earlier than mice infected with A1a, A2 or type B; these results were conserved among multiple strains. Differences were also detected among type A clades as well as between type A clades and type B with respect to bacterial burdens, and gross anatomy in infected mice. Our results indicate that clades defined within F. tularensis subsp. tularensis by molecular typing methods correlate with virulence differences, with A1b strains more virulent than A1a, A2 and type B strains. These findings indicate type A strains are not equivalent with respect to virulence and have important implications for public health as well as basic research programs.

  1. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

  2. Peer Mentoring for Bioinformatics presentation

    OpenAIRE

    Budd, Aidan

    2014-01-01

    A handout used in a HUB (Heidelberg Unseminars in Bioinformatics) meeting focused on career development for bioinformaticians. It describes an activity that helps introduce the idea of peer mentoring, potentially acting as an opportunity to create peer-mentoring groups.

  3. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  4. Divergent regulation of Arabidopsis SAUR genes: a focus on the SAUR10-clade.

    Science.gov (United States)

    van Mourik, Hilda; van Dijk, Aalt D J; Stortenbeker, Niek; Angenent, Gerco C; Bemer, Marian

    2017-12-19

    Small Auxin-Upregulated RNA (SAUR) genes encode growth regulators that induce cell elongation. Arabidopsis contains more than 70 SAUR genes, of which the growth-promoting function has been unveiled in seedlings, while their role in other tissues remained largely unknown. Here, we focus on the regulatory regions of Arabidopsis SAUR genes, to predict the processes in which they play a role, and understand the dynamics of plant growth. In this study, we characterized in detail the entire SAUR10-clade: SAUR8, SAUR9, SAUR10, SAUR12, SAUR16, SAUR50, SAUR51 and SAUR54. Overexpression analysis revealed that the different proteins fulfil similar functions, while the SAUR expression patterns were highly diverse, showing expression throughout plant development in a variety of tissues. In addition, the response to application of different hormones largely varied between the different genes. These tissue-specific and hormone-specific responses could be linked to transcription factor binding sites using in silico analyses. These analyses also supported the existence of two groups of SAURs in Arabidopsis: Class I genes can be induced by combinatorial action of ARF-BZR-PIF transcription factors, while Class II genes are not regulated by auxin. SAUR10-clade genes generally induce cell-elongation, but exhibit diverse expression patterns and responses to hormones. Our experimental and in silico analyses suggest that transcription factors involved in plant development determine the tissue specific expression of the different SAUR genes, whereas the amplitude of this expression can often be controlled by hormone response transcription factors. This allows the plant to fine tune growth in a variety of tissues in response to internal and external signals.

  5. Clade identification of symbiotic zooxanthellae of dominant ...

    African Journals Online (AJOL)

    Partial 28S nuclear ribosomal (nr) DNA of Symbiodinium were amplified by polymerase chain reaction (PCR) and then PCR products were analyzed by the phylogenetic analyses of the LSU DNA sequences based on PAUP and Clustal X software. The results showed that there are at least two clades of Symbiodinium from ...

  6. Bioinformatics Evaluation of Plant Chlorophyllase, the Key Enzyme in Chlorophyll Degradation

    Directory of Open Access Journals (Sweden)

    Ebrahim Sharafi

    2017-06-01

    Full Text Available Background and Objective: Chlorophyllase catalyzes the hydrolysis of chlorophylls to chlorophyllide and phytol. Recently, several applications including removal of chlorophylls from vegetable oils, use in laundry detergents and production of chlorophyllides have been described for chlorophyllase. However, there is little information about the biochemical characteristics of chlorophyllases. Material and Methods: 35 chlorophyllase protein sequences were obtained from the National Centre for Biotechnology Information database. All of the sequences were analyzed using bioinformatics tools for their conserved domain, phylogenetic relationships and biochemical characteristics. Results and Conclusion: The overall domain architecture of chlorophyllases consisted of the esterases/lipases superfamily domain over their full length and the alpha/beta hydrolase family domain over the middle part of their sequences. Plant chlorophyllases could be classified into 4 clades. Molecular weight and pI of the chlorophyllases ranged from 32.65 to 37.77 kDa and from 4.80 to 8.97, respectively. The most stable chlorophyllase is probably obtained from Malus domestica. Chlorophyllases from Solanum pennellii, Triticum aestivum, Triticum urartu, Arabidopsis lyrata, Pachira macrocarpa, Prunus mume and Malus domestica were predicted to be soluble upon overexpression in Escherichia coli. Beta vulgaris and Chenopodium album chlorophyllases were predicted to form no disulfide bond. Chlorophyllases from Jatropha curcas, Amborella trichopoda, Setaria italica, Piper betle, Triticum urartu and Arabidopsis thaliana were predicted to be in non-N-glycosylated form. Conflict of interest: The authors declare no conflict of interest.

  7. G-DOC Plus - an integrative bioinformatics platform for precision medicine.

    Science.gov (United States)

    Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha

    2016-04-30

    G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. 
G-DOC Plus is available

  8. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    Bioinformatics is an interdisciplinary subject, which uses computer application, statistics, mathematics and engineering for the analysis and management of biological information. It has become an important tool for basic and applied research in veterinary sciences. Bioinformatics has brought about advancements into ...

  9. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given on how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  10. Assessment of Data Reliability of Wireless Sensor Network for Bioinformatics

    Directory of Open Access Journals (Sweden)

    Ting Dong

    2017-09-01

    Full Text Available As a focal point of biotechnology, bioinformatics integrates knowledge from biology, mathematics, physics, chemistry, computer science and information science. It generally deals with genome informatics, protein structure and drug design. However, the data or information thus acquired from the main areas of bioinformatics may not be effective. Some researchers combined bioinformatics with wireless sensor networks (WSN) into biosensors and other tools, and applied them to such areas as fermentation, environmental monitoring, food engineering, clinical medicine and the military. In the combination, the WSN is used to collect data and information. The reliability of the WSN in bioinformatics is the prerequisite to effective utilization of information. It is greatly influenced by factors like quality, benefits, service, timeliness and stability, some of which are qualitative and some quantitative. Hence, it is necessary to develop a method that can handle both qualitative and quantitative assessment of information. A viable option is the fuzzy linguistic method, especially the 2-tuple linguistic model, which has been extensively used to cope with such issues. As a result, this paper introduces 2-tuple linguistic representation to assist experts in giving their opinions on different WSNs in bioinformatics that involve multiple factors. Moreover, the author proposes a novel way to determine attribute weights and uses the method to weigh the relative importance of different influencing factors, which can be considered as attributes in the assessment of the WSN in bioinformatics. Finally, an illustrative example is given to provide a reasonable solution for the assessment.

  11. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  12. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  13. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    Bioinformatics is an emerging scientific discipline that uses information ... complex biological questions. ... and computer programs for various purposes of primer ..... polymerase chain reaction: Human Immunodeficiency Virus 1 model studies.

  14. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biological databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not

  15. Open source tools and toolkits for bioinformatics: significance, and where are we?

    Science.gov (United States)

    Stajich, Jason E; Lapp, Hilmar

    2006-09-01

    This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.

  16. Phylogeny and taxonomy of the North American clade of the Ceratocystis fimbriata complex.

    Science.gov (United States)

    Johnson, Jason A; Harrington, Thomas C; Engelbrecht, C J B

    2005-01-01

    Ceratocystis fimbriata is a widely distributed, plant pathogenic fungus that causes wilts and cankers on many woody hosts. Earlier phylogenetic analyses of DNA sequences revealed three geographic clades within the C. fimbriata complex that are centered respectively in North America, Latin America and Asia. This study looked for cryptic species within the North American clade. The internal transcribed spacer regions (ITS) of the rDNA were sequenced, and phylogenetic analysis indicated that most isolates from the North American clade group into four host-associated lineages, referred to as the aspen, hickory, oak and cherry lineages, which were isolated primarily from wounds or diseased trees of Populus, Carya, Quercus and Prunus, respectively. A single isolate collected from P. serotina in Wisconsin had a unique ITS sequence. Allozyme electromorphs also were highly polymorphic within the North American clade, and the inferred phylogenies from these data were congruent with the ITS-rDNA analyses. In pairing experiments isolates from the aspen, hickory, oak and cherry lineages were interfertile only with other isolates from their respective lineages. Inoculation experiments with isolates of the four host-associated groupings showed strong host specialization by isolates from the aspen and hickory lineages on Populus tremuloides and Carya illinoensis, respectively, but isolates from the oak and cherry lineages did not consistently reveal host specialization. Morphological features distinguish isolates in the North American clade from those of the Latin American clade (including C. fimbriata sensu stricto). Based on the phylogenetic evidence, interfertility, host specialization and morphology, the oak and cherry lineages are recognized as the earlier described C. variospora, the poplar lineage as C. populicola sp. nov., and the hickory lineage as C. caryae sp. nov. A new species associated with the bark beetle Scolytus quadrispinosus on Carya is closely related to C

  17. An internet-based bioinformatics toolkit for plant biosecurity diagnosis and surveillance of viruses and viroids.

    Science.gov (United States)

    Barrero, Roberto A; Napier, Kathryn R; Cunnington, James; Liefting, Lia; Keenan, Sandi; Frampton, Rebekah A; Szabo, Tamas; Bulman, Simon; Hunter, Adam; Ward, Lisa; Whattam, Mark; Bellgard, Matthew I

    2017-01-11

    Detecting and preventing the entry of exotic viruses and viroids at the border is critical for protecting plant industries trade worldwide. Existing post entry quarantine screening protocols rely on time-consuming biological indicators and/or molecular assays that require knowledge of infecting viral pathogens. Plants have developed the ability to recognise and respond to viral infections through Dicer-like enzymes that cleave viral sequences into specific small RNA products. Many studies reported the use of a broad range of small RNAs encompassing the product sizes of several Dicer enzymes involved in distinct biological pathways. Here we optimise the assembly of viral sequences by using specific small RNA subsets. We sequenced the small RNA fractions of 21 plants held at quarantine glasshouse facilities in Australia and New Zealand. Benchmarking of several de novo assembler tools yielded SPAdes using a kmer of 19 to produce the best assembly outcomes. We also found that de novo assembly using 21-25 nt small RNAs can result in chimeric assemblies of viral sequences and plant host sequences. Such non-specific assemblies can be resolved by using 21-22 nt or 24 nt small RNA subsets. Among the 21 selected samples, we identified contigs with sequence similarity to 18 viruses and 3 viroids in 13 samples. Most of the viruses were assembled using only 21-22 nt long virus-derived siRNAs (viRNAs), except for one Citrus endogenous pararetrovirus that was more efficiently assembled using 24 nt long viRNAs. All three viroids found in this study were fully assembled using either 21-22 nt or 24 nt viRNAs. Optimised analysis workflows were customised within the Yabi web-based analytical environment. We present a fully automated viral surveillance and diagnosis web-based bioinformatics toolkit that provides a flexible, user-friendly, robust and scalable interface for the discovery and diagnosis of viral pathogens. We have implemented an automated viral surveillance and
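
    The size-based subsetting described in this abstract (keeping only 21-22 nt or only 24 nt small RNAs before assembly) can be sketched as a simple length filter. This is an illustrative sketch only, not the toolkit's actual workflow: the function name `size_subsets` and the plain list-of-strings input are assumptions, and real pipelines would read FASTQ/FASTA files and pass the subsets to an assembler.

```python
def size_subsets(reads):
    """Split small RNA reads into the 21-22 nt and 24 nt length subsets.

    `reads` is any iterable of sequence strings (hypothetical input format).
    Reads of any other length, e.g. 23 nt or 25 nt, are discarded.
    """
    subsets = {"21-22": [], "24": []}
    for read in reads:
        n = len(read)
        if n in (21, 22):
            subsets["21-22"].append(read)
        elif n == 24:
            subsets["24"].append(read)
    return subsets


# Toy reads of lengths 21 to 25 nt, used only to exercise the filter.
reads = ["A" * 21, "C" * 22, "G" * 23, "T" * 24, "A" * 25]
subsets = size_subsets(reads)
```

    Each subset would then be assembled separately, so that a contig built from one size class is not mixed with reads from another.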

  18. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  19. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  20. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal

    2012-07-28

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the graphics processing unit (GPU) and the central processing unit (CPU). It adapts to the implementation according to the number of CPUs given as input by efficiently distributing the workload between the processing units. Using existing resources (GPU and CPU) in an efficient way is a novel approach. The peak performance achieved for the platforms GPU + CPU, GPU + 2CPUs, and GPU + 3CPUs is 10.4 GCUPS, 13.7 GCUPS, and 18.6 GCUPS, respectively (with the query length of 511 amino acid). © 2010 IEEE.
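
    For readers unfamiliar with the algorithm being accelerated, the following is a minimal CPU-only sketch of Smith-Waterman local alignment scoring, together with the GCUPS (giga cell updates per second) measure quoted above. The scoring values and the linear gap penalty are assumptions chosen for illustration; the sketch does not reproduce the authors' hybrid GPU/CPU technique.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score of `a` and `b` (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len: int, db_residues: float, seconds: float) -> float:
    """Giga cell updates per second: matrix cells computed / time / 1e9."""
    return query_len * db_residues / seconds / 1e9
```

    Every cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent; that independence is what parallel implementations exploit.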

  1. Anuran trypanosomes: phylogenetic evidence for new clades in Brazil.

    Science.gov (United States)

    da S Ferreira, Juliana I G; da Costa, Andrea P; Ramirez, Diego; Roldan, Jairo A M; Saraiva, Danilo; da S Founier, Gislene F R; Sue, Ana; Zambelli, Erick R; Minervino, Antonio H H; Verdade, Vanessa K; Gennari, Solange M; Marcili, Arlei

    2015-05-01

    Trypanosomes of anurans and fish are grouped into the Aquatic Clade which includes species isolated from fish, amphibians, turtles and platypus, usually transmitted by leeches and phlebotomine sand flies. Trypanosomes from Brazilian frogs are grouped within the Aquatic Clade with other anuran trypanosome species, where there seems to be coevolutionary patterns with vertebrate hosts and association to Brazilian biomes (Atlantic Forest, Pantanal and Amazonia Rainforest). We characterised the anuran trypanosomes from two different areas of the Cerrado biome and examined their phylogenetic relationships based on the SSU rRNA gene. A total of 112 anurans of six species was analysed and trypanosome prevalence evaluated through haemoculture was found to be 7% (8 positive frogs). However, only three isolates (2.7%) from two anuran species were recovered and cryopreserved. Analysis including SSU rDNA sequences from previous studies segregated the anuran trypanosomes into six groups, the previously reported An01 to An04, and An05 and An06 reported herein. Clade An05 comprises the isolates from Leptodactylus latrans (Steffen) and Pristimantis sp. captured in the Cerrado biome and Trypanosoma chattoni Mathis & Leger, 1911. The inclusion of new isolates in the phylogenetic analyses provided evidence for a new group (An06) of parasites from phlebotomine hosts. Our results indicate that the diversity of trypanosome species is underestimated since studies conducted in Brazil and other regions of the world are still few.

  2. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  3. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  4. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    Science.gov (United States)

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
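
    The performance measures named in this abstract can all be derived from a binary confusion matrix. The sketch below is a generic illustration with made-up counts, not the authors' evaluation code; it assumes every denominator is non-zero. Note that sensitivity and recall are the same quantity.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute common metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```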

  5. Application of bioinformatics tools and databases in microbial dehalogenation research (a review).

    Science.gov (United States)

    Satpathy, R; Konkimalla, V B; Ratha, J

    2015-01-01

    Microbial dehalogenation is a biochemical process in which halogenated substances are converted enzymatically into their non-halogenated forms. Microorganisms have a wide range of organohalogen degradation abilities, both specific and non-specific in nature. Most of these halogenated organic compounds are pollutants and need to be remediated; therefore, current approaches explore the potential of microbes at a molecular level for effective biodegradation of these substances. Several microorganisms with dehalogenation activity have been identified and characterized. In this respect, bioinformatics plays a key role in gaining deeper knowledge of dehalogenation. To facilitate data mining, many tools have been developed to annotate these data from databases. Therefore, with the discovery of a microorganism one can predict a gene/protein and perform sequence analysis, structural modelling, metabolic pathway analysis, biodegradation studies and so on. This review highlights various bioinformatics approaches and describes the application of databases and specific tools in the field of microbial dehalogenation, with special focus on dehalogenase enzymes. Attempts have also been made to decipher some recent applications of in silico modeling methods that comprise gene finding, protein modelling, Quantitative Structure Biodegradability Relationship (QSBR) studies and reconstruction of metabolic pathways employed in dehalogenation research.

  6. Greatly reduced phylogenetic structure in the cultivated potato clade of potatoes, Solanum section Petota

    Science.gov (United States)

    The species boundaries of wild and cultivated potatoes, Solanum section Petota, are controversial with most of the taxonomic problems in a clade containing cultivated potatoes. We here provide the first in-depth phylogenetic study of the cultivated potato clade to explore possible causes of these pr...

  7. The secondary metabolite bioinformatics portal: Computational tools to facilitate synthetic biology of secondary metabolite production

    Directory of Open Access Journals (Sweden)

    Tilmann Weber

    2016-06-01

    Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work. In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http://www.secondarymetabolites.org) is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field.

  8. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    Science.gov (United States)

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  9. Fundamentals of bioinformatics and computational biology: methods and exercises in MATLAB

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  10. Draft genome and sequence variant data of the oomycete Pythium insidiosum strain Pi45 from the phylogenetically-distinct Clade-III

    Directory of Open Access Journals (Sweden)

    Weerayuth Kittichotirat

    2017-12-01

    Pythium insidiosum is a unique oomycete microorganism, capable of infecting humans and animals. The organism can be phylogenetically categorized into three distinct clades: Clade-I (strains from the Americas), Clade-II (strains from Asia and Australia), and Clade-III (strains from Thailand and the United States). Two draft genomes of the P. insidiosum Clade-I strain CDC-B5653 and Clade-II strain Pi-S are available in the public domain. The genome of P. insidiosum from the distinct Clade-III, which is distantly-related to the other two clades, is lacking. Here, we report the draft genome sequence of the P. insidiosum strain Pi45 (also known as MCC13; isolated from a Thai patient with pythiosis; accession numbers BCFM01000001-BCFM01017277) as a representative strain of the phylogenetically-distinct Clade-III. We also report a genome-scale data set of sequence variants (i.e., SNPs and INDELs) found in P. insidiosum (accessible online at the Mendeley database: http://dx.doi.org/10.17632/r75799jy6c.1). Keywords: Pythium insidiosum, Pythiosis, Draft genome, Sequence variant

  11. Evolution and functional insights of different ancestral orthologous clades of chitin synthase genes in the fungal tree of life

    Directory of Open Access Journals (Sweden)

    Mu Li

    2016-02-01

    Chitin synthases (CHSs) are key enzymes in the biosynthesis of chitin, an important structural component of fungal cell walls that can trigger innate immune responses in host plants and animals. Members of CHS gene family perform various functions in fungal cellular processes. Previous studies focused primarily on classifying diverse CHSs into different classes, regardless of their functional diversification, or on characterizing their functions in individual fungal species. A complete and systematic comparative analysis of CHS genes based on their orthologous relationships will be valuable for elucidating the evolution and functions of different CHS genes in fungi. Here, we identified and compared members of the CHS gene family across the fungal tree of life, including 18 divergent fungal lineages. Phylogenetic analysis revealed that the fungal CHS gene family is comprised of at least 10 ancestral orthologous clades, which have undergone multiple independent duplications and losses in different fungal lineages during evolution. Interestingly, one of these CHS clades (class III) was expanded in plant or animal pathogenic fungi belonging to different fungal lineages. Two clades (classes VIb and VIc) identified for the first time in this study occurred mainly in plant pathogenic fungi from Sordariomycetes and Dothideomycetes. Moreover, members of classes III and VIb were specifically up-regulated during plant infection, suggesting important roles in pathogenesis. In addition, CHS-associated networks conserved among plant pathogenic fungi are involved in various biological processes, including sexual reproduction and plant infection. We also identified specificity-determining sites, many of which are located at or adjacent to important structural and functional sites that are potentially responsible for functional divergence of different CHS classes. Overall, our results provide new insights into the evolution and function of members of CHS gene

  12. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  13. Systems Bioinformatics: increasing precision of computational diagnostics and therapeutics through network-based approaches.

    Science.gov (United States)

    Oulas, Anastasis; Minadakis, George; Zachariou, Margarita; Sokratous, Kleitos; Bourdakou, Marilena M; Spyrou, George M

    2017-11-27

    Systems Bioinformatics is a relatively new approach, which lies in the intersection of systems biology and classical bioinformatics. It focuses on integrating information across different levels using a bottom-up approach as in systems biology with a data-driven top-down approach as in bioinformatics. The advent of omics technologies has provided the stepping-stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics and proteomics to epigenomics, pharmacogenomics, metagenomics and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest and studying the emerging properties of the system as a whole rather than the sum of the properties derived from the system's individual components. A key approach in Systems Bioinformatics is the construction of multiple networks representing each level of the omics spectrum and their integration in a layered network that exchanges information within and between layers. Here, we provide evidence on how Systems Bioinformatics enhances computational therapeutics and diagnostics, hence paving the way to precision medicine. The aim of this review is to familiarize the reader with the emerging field of Systems Bioinformatics and to provide a comprehensive overview of its current state-of-the-art methods and technologies. Moreover, we provide examples of success stories and case studies that utilize such methods and tools to significantly advance research in the fields of systems biology and systems medicine. © The Author 2017. Published by Oxford University Press.

  14. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Conclusions: Use of the microservices framework

  15. The growing need for microservices in bioinformatics.

    Science.gov (United States)

    Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J

    2016-01-01

    Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on nimble IT framework which can adapt to changing requirements. To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and
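
    To make the idea of a service that is "limited in functional scope" concrete, here is a minimal sketch of a single-purpose microservice built only on the Python standard library. The `/gc` endpoint, the port and the choice of GC content as the task are all assumptions made for illustration; none of them come from the paper.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in `seq`; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


class GCHandler(BaseHTTPRequestHandler):
    """Hypothetical service with one endpoint: GET /gc?seq=ACGT."""

    def do_GET(self) -> None:
        url = urlparse(self.path)
        if url.path != "/gc":
            self.send_error(404, "unknown endpoint")
            return
        seq = parse_qs(url.query).get("seq", [""])[0]
        body = json.dumps({"length": len(seq), "gc": gc_content(seq)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the service until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), GCHandler).serve_forever()
```

    Because the service does one thing behind a small HTTP interface, it can be replaced or scaled independently of the rest of a pipeline, which is the kind of functional isolation the abstract describes.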

  16. The growing need for microservices in bioinformatics

    Science.gov (United States)

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework is an effective

  17. Characterization of humoral responses to soluble trimeric HIV gp140 from a clade A Ugandan field isolate.

    Science.gov (United States)

    Visciano, Maria Luisa; Tagliamonte, Maria; Stewart-Jones, Guillaume; Heyndrickx, Leo; Vanham, Guido; Jansson, Marianne; Fomsgaard, Anders; Grevstad, Berit; Ramaswamy, Meghna; Buonaguro, Franco M; Tornesello, Maria Lina; Biswas, Priscilla; Scarlatti, Gabriella; Buonaguro, Luigi

    2013-07-08

    Trimeric soluble forms of HIV gp140 envelope glycoproteins represent one of the closest molecular structures compared to native spikes present on intact virus particles. Trimeric soluble gp140 have been generated by several groups and such molecules have been shown to induce antibodies with neutralizing activity against homologous and heterologous viruses. In the present study, we generated a recombinant trimeric soluble gp140, derived from a previously identified Ugandan A-clade HIV field isolate (gp140 94UG018). Antibodies elicited in immunized rabbits show a broad binding pattern to HIV envelopes of different clades. An epitope mapping analysis reveals that, on average, the binding is mostly focused on the C1, C2, V3, V5 and C5 regions. Immune sera show neutralization activity to Tier 1 isolates of different clades, demonstrating cross clade neutralizing activity which needs to be further broadened by possible structural modifications of the clade A gp140 94UG018. Our results provide a rationale for the design and evaluation of immunogens and the clade A gp140 94UG018 shows promising characteristics for potential involvement in an effective HIV vaccine with broad activity.

  18. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  19. An interdepartmental Ph.D. program in computational biology and bioinformatics: the Yale perspective.

    Science.gov (United States)

    Gerstein, Mark; Greenbaum, Dov; Cheung, Kei; Miller, Perry L

    2007-02-01

    Computational biology and bioinformatics (CBB), the terms often used interchangeably, represent a rapidly evolving biological discipline. With the clear potential for discovery and innovation, and the need to deal with the deluge of biological data, many academic institutions are committing significant resources to develop CBB research and training programs. Yale formally established an interdepartmental Ph.D. program in CBB in May 2003. This paper describes Yale's program, discussing the scope of the field, the program's goals and curriculum, as well as a number of issues that arose in implementing the program. (Further updated information is available from the program's website, www.cbb.yale.edu.)

  20. Unique Phylogenetic Lineage Found in the Fusarium-like Clade after Re-examining BCCM/IHEM Fungal Culture Collection Material.

    Science.gov (United States)

    Triest, David; De Cremer, Koen; Piérard, Denis; Hendrickx, Marijke

    2016-09-01

    Recently, the Fusarium genus has been narrowed based upon phylogenetic analyses and a Fusarium-like clade was adopted. The few species of the Fusarium-like clade were moved to new, re-installed or existing genera or provisionally retained as "Fusarium." Only a limited number of reference strains and DNA marker sequences are available for this clade and not much is known about its actual species diversity. Here, we report six strains, preserved by the Belgian fungal culture collection BCCM/IHEM as a Fusarium species, that belong to the Fusarium-like clade. They showed a slow growth and produced pionnotes, typical morphological characteristics of many Fusarium-like species. Multilocus sequencing with comparative sequence analyses in GenBank and phylogenetic analyses, using reference sequences of type material, confirmed that they were indeed members of the Fusarium-like clade. One strain was identified as "Fusarium" ciliatum whereas another strain was identified as Fusicolla merismoides. The four remaining strains were shown to represent a unique phylogenetic lineage in the Fusarium-like clade and were also found morphologically distinct from other members of the Fusarium-like clade. Based upon phylogenetic considerations, a new genus, Pseudofusicolla gen. nov., and a new species, Pseudofusicolla belgica sp. nov., were installed for this lineage. A formal description is provided in this study. Additional sampling will be required to gather isolates other than the historical strains presented in the present study as well as to further reveal the actual species diversity in the Fusarium-like clade.

  1. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floriano, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  2. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  3. clubber: removing the bioinformatics bottleneck in big data analyses

    Science.gov (United States)

    Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana

    2018-01-01

    With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment. PMID:28609295
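
    The load-balancing idea described in the clubber abstract above can be illustrated with a minimal sketch. This is not clubber's actual scheduler or API: the function `balance_jobs`, the cluster names, the slot counts and the job costs are all hypothetical, and the greedy "least loaded cluster first" rule is an assumption chosen only for illustration.

```python
import heapq


def balance_jobs(job_costs, cluster_slots):
    """Greedily assign jobs to the cluster with the lowest load per slot.

    job_costs: dict of job name -> estimated cost (arbitrary units).
    cluster_slots: dict of cluster name -> number of parallel slots.
    Returns a dict of cluster name -> list of assigned job names.
    """
    # Heap entries: (load normalised by slot count, cluster name, raw load).
    heap = [(0.0, name, 0.0) for name in sorted(cluster_slots)]
    heapq.heapify(heap)
    assignment = {name: [] for name in cluster_slots}
    # Place the most expensive jobs first (longest-processing-time heuristic).
    for job, cost in sorted(job_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        _, name, raw = heapq.heappop(heap)
        assignment[name].append(job)
        raw += cost
        heapq.heappush(heap, (raw / cluster_slots[name], name, raw))
    return assignment


# Hypothetical workload: four annotation jobs, two compute resources.
jobs = {"m1": 8, "m2": 4, "m3": 4, "m4": 2}
clusters = {"local_hpc": 2, "cloud": 1}
plan = balance_jobs(jobs, clusters)
print(plan)
```

    Normalising the load by slot count lets a larger resource absorb more work than a smaller one, which is one simple way to handle the heterogeneous environments the abstract mentions.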

  4. clubber: removing the bioinformatics bottleneck in big data analyses.

    Science.gov (United States)

    Miller, Maximilian; Zhu, Chengsheng; Bromberg, Yana

    2017-06-13

    With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these "big data" analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber's goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.

  5. clubber: removing the bioinformatics bottleneck in big data analyses

    Directory of Open Access Journals (Sweden)

    Miller Maximilian

    2017-06-01

    With the advent of modern day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.

  6. Plastid genome evolution across the genus Cuscuta (Convolvulaceae): two clades within subgenus Grammica exhibit extensive gene loss.

    Science.gov (United States)

    Braukmann, Thomas; Kuzmina, Maria; Stefanovic, Sasa

    2013-02-01

    The genus Cuscuta (Convolvulaceae, the morning glory family) is one of the most intensely studied lineages of parasitic plants. Whole plastome sequencing of four Cuscuta species has demonstrated changes to both plastid gene content and structure. The presence of photosynthetic genes under purifying selection indicates that Cuscuta is cryptically photosynthetic. However, the tempo and mode of plastid genome evolution across the diversity of this group (~200 species) remain largely unknown. A comparative investigation of plastid genome content, grounded within a phylogenetic framework, was conducted using a slot-blot Southern hybridization approach. Cuscuta was extensively sampled (~56% of species), including groups previously suggested to possess more altered plastomes compared with other members of this genus. A total of 56 probes derived from all categories of protein-coding genes, typically found within the plastomes of flowering plants, were used. The results indicate that two clades within subgenus Grammica (clades 'O' and 'K') exhibit substantially more plastid gene loss relative to other members of Cuscuta. All surveyed members of the 'O' clade show extensive losses of plastid genes from every category of genes typically found in the plastome, including otherwise highly conserved small and large ribosomal subunits. The extent of plastid gene losses within this clade is similar in magnitude to that observed previously in some non-asterid holoparasites, in which the very presence of a plastome has been questioned. The 'K' clade also exhibits considerable loss of plastid genes. Unlike in the 'O' clade, in which all species seem to be affected, the losses in clade 'K' progress phylogenetically, following a pattern consistent with the Evolutionary Transition Series hypothesis. This clade presents an ideal opportunity to study the reduction of the plastome of parasites 'in action'. The widespread plastid gene loss in these two clades is hypothesized to be a

  7. Phylogeny of the Ampelocissus-Vitis clade in Vitaceae supports the New World origin of the grape genus.

    Science.gov (United States)

    Liu, Xiu-Qun; Ickert-Bond, Stefanie M; Nie, Ze-Long; Zhou, Zhuo; Chen, Long-Qing; Wen, Jun

    2016-02-01

    The grapes and the close allies in Vitaceae are of great agronomic and economic importance. Our previous studies showed that the grape genus Vitis was closely related to three tropical genera, which formed the Ampelocissus-Vitis clade (including Vitis, Ampelocissus, Nothocissus and Pterisanthes). Yet the phylogenetic relationships of the four genera within this clade remain poorly resolved. Furthermore, the geographic origin of Vitis is still controversial, because the sampling of the close relatives of Vitis was too limited in the previous studies. This study reconstructs the phylogenetic relationships within the clade, and hypothesizes the origin of Vitis in a broader phylogenetic framework, using five plastid and two nuclear markers. The Ampelocissus-Vitis clade is supported to be composed of five main lineages. Vitis includes two described subgenera each as a monophyletic group. Ampelocissus is paraphyletic. The New World Ampelocissus does not form a clade and shows a complex phylogenetic relationship, with A. acapulcensis and A. javalensis forming a clade, and A. erdvendbergiana sister to Vitis. The majority of the Asian Ampelocissus species form a clade, within which Pterisanthes is nested. Pterisanthes is polyphyletic, suggesting that the lamellate inflorescence characteristic of the genus represents convergence. Nothocissus is sister to the clade of Asian Ampelocissus and Pterisanthes. The African Ampelocissus forms a clade with several Asian species. Based on the Bayesian dating and both the RASP and Lagrange analyses, Vitis is inferred to have originated in the New World during the late Eocene (39.4Ma, 95% HPD: 32.6-48.6Ma), then migrated to Eurasia in the late Eocene (37.3Ma, 95% HPD: 30.9-45.1Ma). The North Atlantic land bridges (NALB) are hypothesized to be the most plausible route for the Vitis migration from the New World to Eurasia, while intercontinental long distance dispersal (LDD) cannot be eliminated as a likely mechanism.

  8. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  9. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  10. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.

    Science.gov (United States)

    Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel

    2015-06-02

    Bioinformaticians face a range of difficulties in getting locally installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS mediates access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on ordinary computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that lists the registered applications and clients. Applications registered with BOWS can be accessed from virtually any programming language through web services, or using standard Java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run computationally demanding applications directly from their machines.
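
    The front-end/back-end division described in the BOWS abstract above can be sketched with an in-memory stand-in. This is not the BOWS web-service interface: the class `JobBroker`, its method names and the "revcomp" tool are hypothetical, and plain method calls replace the real web-service requests.

```python
import itertools


class JobBroker:
    """In-memory stand-in for a broker between clients and HPC-side workers."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}  # job id -> dict with tool, params, status, result

    # Front-end operations, used by client programs.
    def submit(self, tool, params):
        job_id = next(self._ids)
        self._jobs[job_id] = {"tool": tool, "params": params,
                              "status": "new", "result": None}
        return job_id

    def read_result(self, job_id):
        job = self._jobs[job_id]
        return job["result"] if job["status"] == "done" else None

    # Back-end operations, used by the application installed on the cluster.
    def fetch_new_job(self, tool):
        for job_id, job in self._jobs.items():
            if job["tool"] == tool and job["status"] == "new":
                job["status"] = "running"
                return job_id, job["params"]
        return None

    def post_result(self, job_id, result):
        self._jobs[job_id].update(status="done", result=result)


broker = JobBroker()
job_id = broker.submit("revcomp", {"seq": "ATGC"})

# Worker side: check for a new job, run it, send the result back.
fetched_id, params = broker.fetch_new_job("revcomp")
complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
result = "".join(complement[base] for base in reversed(params["seq"]))
broker.post_result(fetched_id, result)

print(broker.read_result(job_id))
```

    Because the worker only polls for jobs and posts results, it can be written in any language and sit behind a cluster firewall, which matches the separation of concerns the abstract describes.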

  11. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  12. BioSmalltalk: a pure object system and library for bioinformatics.

    Science.gov (United States)

    Morales, Hernán F; Giovambattista, Guillermo

    2013-09-15

    We have developed BioSmalltalk, a new environment system for pure object-oriented bioinformatics programming. Adaptive end-user programming systems tend to become more important for discovering biological knowledge, as is demonstrated by the emergence of open-source programming toolkits for bioinformatics in the past years. Our software is intended to bridge the gap between bioscientists and rapid software prototyping while preserving the possibility of scaling to whole-system biology applications. BioSmalltalk performs better in terms of execution time and memory usage than Biopython and BioPerl for some classical situations. BioSmalltalk is cross-platform and freely available (MIT license) through the Google Project Hosting at http://code.google.com/p/biosmalltalk. Contact: hernan.morales@gmail.com. Supplementary data are available at Bioinformatics online.

  13. Synergistic activity profile of griffithsin in combination with tenofovir, maraviroc and enfuvirtide against HIV-1 clade C

    International Nuclear Information System (INIS)

    Ferir, Geoffrey; Palmer, Kenneth E.; Schols, Dominique

    2011-01-01

    Griffithsin (GRFT) is possibly the most potent anti-HIV peptide found in natural sources. Due to its potent and broad-spectrum antiviral activity and unique safety profile, it has great potential as a topical microbicide component. Here, we evaluated various combinations of GRFT against HIV-1 clade B and clade C isolates in primary peripheral blood mononuclear cells (PBMCs) and in CD4+ MT-4 cells. In all combinations tested, GRFT showed a synergistic activity profile with tenofovir, maraviroc and enfuvirtide based on the median effect principle, with combination indices (CI) varying between 0.34 and 0.79 at the calculated EC95 level. Furthermore, the different glycosylation patterns on the viral envelope of clade B and clade C gp120 had no observable effect on the synergistic interactions. Overall, we can conclude that the evaluated two-drug combinations increase antiviral potency and support further clinical investigation of GRFT combinations for pre-exposure prophylaxis in the context of HIV-1 clade C infection.
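
    For readers unfamiliar with the combination index, the calculation can be sketched as follows, assuming the standard median-effect form in which the single-drug dose for a fraction affected fa is Dx = Dm * (fa / (1 - fa))^(1/m) and CI = d1/Dx1 + d2/Dx2, with CI below 1 read as synergy. The potencies, slopes and doses below are made-up illustrative numbers, not data from the study, and the function names are hypothetical.

```python
def dose_for_effect(dm, m, fa):
    """Single-drug dose giving fraction affected fa (median-effect equation).

    dm: median-effect dose (dose at 50% effect); m: slope of the curve.
    """
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(d1, d2, dm1, m1, dm2, m2, fa):
    """Combination index for doses d1 and d2 that jointly reach effect fa."""
    dx1 = dose_for_effect(dm1, m1, fa)
    dx2 = dose_for_effect(dm2, m2, fa)
    return d1 / dx1 + d2 / dx2


# Illustrative values only: drug 1 has Dm = 1.0, drug 2 has Dm = 2.0,
# both with slope m = 1, evaluated at the 95% effect level.
ci = combination_index(d1=4.75, d2=9.5, dm1=1.0, m1=1.0,
                       dm2=2.0, m2=1.0, fa=0.95)
print(round(ci, 3))
```

    With these numbers each drug alone would need 19 and 38 dose units respectively to reach 95% effect, so the combination uses a quarter of each and the index comes out at about 0.5, inside the 0.34 to 0.79 range the abstract reports as synergistic.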

  14. Expansion of a urethritis-associated Neisseria meningitidis clade in the United States with concurrent acquisition of N. gonorrhoeae alleles.

    Science.gov (United States)

    Retchless, Adam C; Kretz, Cécilia B; Chang, How-Yi; Bazan, Jose A; Abrams, A Jeanine; Norris Turner, Abigail; Jenkins, Laurel T; Trees, David L; Tzeng, Yih-Ling; Stephens, David S; MacNeil, Jessica R; Wang, Xin

    2018-03-02

    Increased reports of Neisseria meningitidis urethritis in multiple U.S. cities during 2015 have been attributed to the emergence of a novel clade of nongroupable N. meningitidis within the ST-11 clonal complex, the "U.S. NmNG urethritis clade". Genetic recombination with N. gonorrhoeae has been proposed to enable efficient sexual transmission by this clade. To understand the evolutionary origin and diversification of the U.S. NmNG urethritis clade, whole-genome phylogenetic analysis was performed to identify its members among the N. meningitidis strain collection from the Centers for Disease Control and Prevention, including 209 urogenital and rectal N. meningitidis isolates submitted by U.S. public health departments in eleven states starting in 2015. The earliest representatives of the U.S. NmNG urethritis clade were identified from cases of invasive disease that occurred in 2013. Among 209 urogenital and rectal isolates submitted from January 2015 to September 2016, the clade accounted for 189/198 male urogenital isolates, 3/4 female urogenital isolates, and 1/7 rectal isolates. In total, members of the clade were isolated in thirteen states between 2013 and 2016, which evolved from a common ancestor that likely existed during 2011. The ancestor contained N. gonorrhoeae-like alleles in three regions of its genome, two of which may facilitate nitrite-dependent anaerobic growth during colonization of urogenital sites. Additional gonococcal-like alleles were acquired as the clade diversified. Notably, one isolate contained a sequence associated with azithromycin resistance in N. gonorrhoeae, but no other gonococcal antimicrobial resistance determinants were detected. Interspecies genetic recombination contributed to the early evolution and subsequent diversification of the U.S. NmNG urethritis clade. Ongoing acquisition of N. gonorrhoeae alleles by the U.S. NmNG urethritis clade may facilitate the expansion of its ecological niche while also increasing the

  15. e-MIR2: a public online inventory of medical informatics resources.

    Science.gov (United States)

    de la Calle, Guillermo; García-Remesal, Miguel; Nkumu-Mbomio, Nelida; Kulikowski, Casimir; Maojo, Victor

    2012-08-02

    Over the past years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for the Medical Informatics (MI) field, so that locating and accessing them currently remains a difficult and time-consuming task. We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. We define informatics resources as all those elements that constitute, serve to define or are used by informatics systems, ranging from architectures or development methodologies to terminologies, vocabularies, databases or tools. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources' names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different classification schemas by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the classification schemas. The classification algorithm identifies the categories associated with resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contains 609 resources at the time of writing and is available at http://www.gib.fi.upm.es/eMIR2. We are continuing to expand the number of available resources by taking into account further

  16. Unique Phylogenetic Lineage Found in the Fusarium-like Clade after Re-examining BCCM/IHEM Fungal Culture Collection Material

    Science.gov (United States)

    De Cremer, Koen; Piérard, Denis; Hendrickx, Marijke

    2016-01-01

    Recently, the Fusarium genus has been narrowed based upon phylogenetic analyses and a Fusarium-like clade was adopted. The few species of the Fusarium-like clade were moved to new, re-installed or existing genera or provisionally retained as "Fusarium." Only a limited number of reference strains and DNA marker sequences are available for this clade and not much is known about its actual species diversity. Here, we report six strains, preserved by the Belgian fungal culture collection BCCM/IHEM as a Fusarium species, that belong to the Fusarium-like clade. They showed a slow growth and produced pionnotes, typical morphological characteristics of many Fusarium-like species. Multilocus sequencing with comparative sequence analyses in GenBank and phylogenetic analyses, using reference sequences of type material, confirmed that they were indeed members of the Fusarium-like clade. One strain was identified as "Fusarium" ciliatum whereas another strain was identified as Fusicolla merismoides. The four remaining strains were shown to represent a unique phylogenetic lineage in the Fusarium-like clade and were also found morphologically distinct from other members of the Fusarium-like clade. Based upon phylogenetic considerations, a new genus, Pseudofusicolla gen. nov., and a new species, Pseudofusicolla belgica sp. nov., were installed for this lineage. A formal description is provided in this study. Additional sampling will be required to gather isolates other than the historical strains presented in the present study as well as to further reveal the actual species diversity in the Fusarium-like clade. PMID:27790062

  17. A bioinformatics roadmap for the human vaccines project.

    Science.gov (United States)

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  18. e-MIR2: a public online inventory of medical informatics resources

    Directory of Open Access Journals (Sweden)

    de la Calle Guillermo

    2012-08-01

    Background: Over the past years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for the Medical Informatics (MI) field, so that locating and accessing them currently remains a difficult and time-consuming task. Description: We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. We define informatics resources as all those elements that constitute, serve to define or are used by informatics systems, ranging from architectures or development methodologies to terminologies, vocabularies, databases or tools. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources’ names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different classification schemas by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the classification schemas. The classification algorithm identifies the categories associated with resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. Conclusions: We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contains 609 resources at the time of writing and is available at http://www.gib.fi.upm.es/eMIR2. We are continuing to expand the number

  19. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Flavivirus infections are the most prevalent arthropod-borne infections worldwide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bioinformatics is applied to assist in the rational design and improvement of vaccines, particularly flavivirus vaccines, are presented and discussed.

  20. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Chepyshko Hanna

    2012-07-01

    Background: GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results: In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may possibly signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were

  1. Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.

    Science.gov (United States)

    Huang, Xin; Li, Hao-ming

    2009-08-05

    Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. According to the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed through internet resources and software like DNAMAN. Target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA and the secondary and three-dimensional structures of LovE protein were predicted. In the lovastatin biosynthesis process lovE is a regulatory gene and LovE protein is a GAL4-like transcriptional factor.

  2. Genes of the most conserved WOX clade in plants affect root and flower development in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Moreau Hervé

    2008-10-01

    Background: The Wuschel related homeobox (WOX) family proteins are key regulators implicated in the determination of cell fate in plants by preventing cell differentiation. A recent WOX phylogeny, based on WOX homeodomains, showed that all of the Physcomitrella patens and Selaginella moellendorffii WOX proteins clustered into a single orthologous group. We hypothesized that members of this group might preferentially share a significant part of their function in phylogenetically distant organisms. Hence, we first validated the limits of the WOX13 orthologous group (WOX13 OG) using the occurrence of other clade specific signatures and conserved intron insertion sites. Secondly, a functional analysis using expression data and mutants was undertaken. Results: The WOX13 OG contained the most conserved plant WOX proteins including the only WOX detected in the highly proliferating basal unicellular and photosynthetic organism Ostreococcus tauri. A large expansion of the WOX family was observed after the separation of mosses from other land plants and before monocots and dicots have arisen. In Arabidopsis thaliana, AtWOX13 was dynamically expressed during primary and lateral root initiation and development, in gynoecium and during embryo development. AtWOX13 appeared to affect the floral transition. An intriguing clade, represented by the functional AtWOX14 gene inside the WOX13 OG, was only found in the Brassicaceae. Compared to AtWOX13, the gene expression profile of AtWOX14 was restricted to the early stages of lateral root formation and specific to developing anthers. A mutational insertion upstream of the AtWOX14 homeodomain sequence led to abnormal root development, a delay in the floral transition and premature anther differentiation. Conclusion: Our data provide evidence in favor of the WOX13 OG as the clade containing the most conserved WOX genes and established a functional link to organ initiation and development in Arabidopsis, most

  3. Correspondence regarding Zhong et al., BMC Bioinformatics 2013 Mar 7;14:89.

    Science.gov (United States)

    Kuhn, Alexandre

    2014-11-28

    Computational expression deconvolution aims to estimate the contribution of individual cell populations to expression profiles measured in samples of heterogeneous composition. Zhong et al. recently proposed Digital Sorting Algorithm (BMC Bioinformatics 2013 Mar 7;14:89) and showed that they could accurately estimate population-specific expression levels and expression differences between two populations. They compared DSA with Population-Specific Expression Analysis (PSEA), a previous deconvolution method that we developed to detect expression changes occurring within the same population between two conditions (e.g. disease versus non-disease). However, Zhong et al. compared PSEA-derived specific expression levels across different cell populations. Specific expression levels obtained with PSEA cannot be directly compared across different populations as they are on a relative scale. They are accurate as we demonstrate by deconvolving the same dataset used by Zhong et al. and, importantly, allow for comparison of population-specific expression across conditions.
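
    The idea of expression deconvolution described in this record can be illustrated with a minimal sketch. It assumes a generic linear mixing model in which the expression measured in each sample is the sum of population-specific expression levels weighted by known population fractions. This is neither DSA nor PSEA as published; the function name and the example numbers are hypothetical.

```python
def deconvolve_two_populations(fractions, mixed_expression):
    """Least-squares estimate of population-specific expression for one gene.

    fractions: list of (f_a, f_b) population fractions, one pair per sample.
    mixed_expression: list of measured expression values, one per sample.
    Assumed model (illustrative only): y_i = f_a_i * x_a + f_b_i * x_b.
    """
    # Build the 2x2 normal equations (F^T F) x = F^T y.
    s_aa = sum(fa * fa for fa, _ in fractions)
    s_ab = sum(fa * fb for fa, fb in fractions)
    s_bb = sum(fb * fb for _, fb in fractions)
    t_a = sum(fa * y for (fa, _), y in zip(fractions, mixed_expression))
    t_b = sum(fb * y for (_, fb), y in zip(fractions, mixed_expression))

    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("population fractions do not vary enough across samples")

    # Solve the 2x2 system with Cramer's rule.
    x_a = (t_a * s_bb - t_b * s_ab) / det
    x_b = (t_b * s_aa - t_a * s_ab) / det
    return x_a, x_b


# Hypothetical data: three samples mixing two populations in different ratios.
fractions = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]
mixed = [0.2 * 10 + 0.8 * 2, 0.5 * 10 + 0.5 * 2, 0.8 * 10 + 0.2 * 2]
x_a, x_b = deconvolve_two_populations(fractions, mixed)
```

    With noise-free input the estimates recover the levels used to build the mixtures. When the regressors are marker-derived reference signals rather than true fractions, the estimates for different populations are on different relative scales, which is why the record cautions against comparing them across populations.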

  4. Atomic force microscopy imaging reveals the formation of ASIC/ENaC cross-clade ion channels

    International Nuclear Information System (INIS)

    Jeggle, Pia; Smith, Ewan St. J.; Stewart, Andrew P.; Haerteis, Silke; Korbmacher, Christoph; Edwardson, J. Michael

    2015-01-01

    ASIC and ENaC are co-expressed in various cell types, and there is evidence for a close association between them. Here, we used atomic force microscopy (AFM) to determine whether ASIC1a and ENaC subunits are able to form cross-clade hybrid ion channels. ASIC1a and ENaC could be co-isolated from detergent extracts of tsA 201 cells co-expressing the two subunits. Isolated proteins were incubated with antibodies against ENaC and Fab fragments against ASIC1a. AFM imaging revealed proteins that were decorated by both an antibody and a Fab fragment with an angle of ∼120° between them, indicating the formation of ASIC1a/ENaC heterotrimers. - Highlights: • There is evidence for a close association between ASIC and ENaC. • We used AFM to test whether ASIC1a and ENaC subunits form cross-clade ion channels. • Isolated proteins were incubated with subunit-specific antibodies and Fab fragments. • Some proteins were doubly decorated at ∼120° by an antibody and a Fab fragment. • Our results indicate the formation of ASIC1a/ENaC heterotrimers

  5. Atomic force microscopy imaging reveals the formation of ASIC/ENaC cross-clade ion channels

    Energy Technology Data Exchange (ETDEWEB)

    Jeggle, Pia; Smith, Ewan St. J.; Stewart, Andrew P. [Department of Pharmacology, University of Cambridge, Tennis Court Road, Cambridge CB2 1PD (United Kingdom); Haerteis, Silke; Korbmacher, Christoph [Institut für Zelluläre und Molekulare Physiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstrasse 6, 91054 Erlangen (Germany); Edwardson, J. Michael, E-mail: jme1000@cam.ac.uk [Department of Pharmacology, University of Cambridge, Tennis Court Road, Cambridge CB2 1PD (United Kingdom)

    2015-08-14

    ASIC and ENaC are co-expressed in various cell types, and there is evidence for a close association between them. Here, we used atomic force microscopy (AFM) to determine whether ASIC1a and ENaC subunits are able to form cross-clade hybrid ion channels. ASIC1a and ENaC could be co-isolated from detergent extracts of tsA 201 cells co-expressing the two subunits. Isolated proteins were incubated with antibodies against ENaC and Fab fragments against ASIC1a. AFM imaging revealed proteins that were decorated by both an antibody and a Fab fragment with an angle of ∼120° between them, indicating the formation of ASIC1a/ENaC heterotrimers. - Highlights: • There is evidence for a close association between ASIC and ENaC. • We used AFM to test whether ASIC1a and ENaC subunits form cross-clade ion channels. • Isolated proteins were incubated with subunit-specific antibodies and Fab fragments. • Some proteins were doubly decorated at ∼120° by an antibody and a Fab fragment. • Our results indicate the formation of ASIC1a/ENaC heterotrimers.

  6. mORCA: sailing bioinformatics world with mobile devices.

    Science.gov (United States)

    Díaz-Del-Pino, Sergio; Falgueras, Juan; Perez-Wohlfeil, Esteban; Trelles, Oswaldo

    2018-03-01

    Nearly 10 years have passed since the first mobile apps appeared. Given the fact that bioinformatics is a web-based world and that mobile devices are endowed with web-browsers, it seemed natural that bioinformatics would transit from personal computers to mobile devices but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Through an in-depth analysis of the requirements of existing bioinformatics applications we designed and deployed an easy-to-use web-based lightweight mobile client. This client can browse and select services, automatically compose interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App store by Apple and Play Store by Google. The software will be available for at least 2 years. Contact: ortrelles@uma.es. Source code, final web-app, training material and documentation are available at http://bitlab-es.com/morca. © The Author(s) 2017. Published by Oxford University Press.

  7. p3d--Python module for structural bioinformatics.

    Science.gov (United States)

    Fufezan, Christian; Specht, Michael

    2009-08-21

    High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.
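
    The combination of a spatial tree with set theory described in this record can be sketched in plain Python. The sketch below is not the p3d API: the `Atom` tuple, `build_tree` and `within` are hypothetical names, and a k-d tree (an axis-aligned form of binary space partitioning) stands in for the BSP tree that p3d implements.

```python
from collections import namedtuple

# Hypothetical minimal atom record: index, element symbol, coordinates.
Atom = namedtuple("Atom", "idx element x y z")


def build_tree(atoms, depth=0):
    """Recursively split atoms on x, y, z in turn (axis-aligned BSP tree)."""
    if not atoms:
        return None
    axis = depth % 3
    atoms = sorted(atoms, key=lambda a: a[2 + axis])
    mid = len(atoms) // 2
    return {
        "atom": atoms[mid],
        "axis": axis,
        "left": build_tree(atoms[:mid], depth + 1),
        "right": build_tree(atoms[mid + 1:], depth + 1),
    }


def within(node, centre, radius, found=None):
    """Return the set of atom indices within `radius` of `centre`."""
    if found is None:
        found = set()
    if node is None:
        return found
    atom, axis = node["atom"], node["axis"]
    d2 = sum((c - p) ** 2 for c, p in zip(centre, atom[2:]))
    if d2 <= radius ** 2:
        found.add(atom.idx)
    delta = centre[axis] - atom[2 + axis]
    # Descend only into half-spaces that the query sphere can reach.
    if delta - radius <= 0:
        within(node["left"], centre, radius, found)
    if delta + radius >= 0:
        within(node["right"], centre, radius, found)
    return found


# Hypothetical coordinates, in arbitrary units.
atoms = [
    Atom(0, "O", 0.0, 0.0, 0.0),
    Atom(1, "C", 1.0, 0.0, 0.0),
    Atom(2, "O", 3.0, 0.0, 0.0),
    Atom(3, "N", 0.0, 1.5, 0.0),
    Atom(4, "O", 10.0, 10.0, 10.0),
]
tree = build_tree(atoms)
near = within(tree, (0.0, 0.0, 0.0), 2.0)            # spatial query
oxygens = {a.idx for a in atoms if a.element == "O"}  # attribute query
near_oxygens = near & oxygens                         # set intersection
near_non_oxygens = near - oxygens                     # set difference
```

    Because each query returns an ordinary set of atom indices, spatial and attribute criteria can be combined with the standard set operators, which is the design idea the abstract attributes to p3d.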

  8. Taxonomic evaluation of species in the Streptomyces hirsutus clade using multi-locus sequence analysis and proposals to reclassify several species in this clade

    Science.gov (United States)

    Previous phylogenetic analyses of species of Streptomyces based on 16S rRNA gene sequences resulted in a statistically well-supported clade (100% bootstrap value) containing 8 species that exhibited very similar gross morphology in producing open looped (Retinaculum-Apertum) to spiral (Spira) chains...

  9. A global perspective on evolving bioinformatics and data science training needs.

    Science.gov (United States)

    Attwood, Teresa K; Blackford, Sarah; Brazas, Michelle D; Davies, Angela; Schneider, Maria Victoria

    2017-08-29

    Bioinformatics is now intrinsic to life science research, but the past decade has witnessed a continuing deficiency in this essential expertise. Basic data stewardship is still taught relatively rarely in life science education programmes, creating a chasm between theory and practice, and fuelling demand for bioinformatics training across all educational levels and career roles. Concerned by this, surveys have been conducted in recent years to monitor bioinformatics and computational training needs worldwide. This article briefly reviews the principal findings of a number of these studies. We see that there is still a strong appetite for short courses to improve expertise and confidence in data analysis and interpretation; strikingly, however, the most urgent appeal is for bioinformatics to be woven into the fabric of life science degree programmes. Satisfying the relentless training needs of current and future generations of life scientists will require a concerted response from stakeholders across the globe, who need to deliver sustainable solutions capable of both transforming education curricula and cultivating a new cadre of trainer scientists. © The Author 2017. Published by Oxford University Press.

  10. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora of programs for PCR primer design reflects the.

  11. Bird evolution: testing the Metaves clade with six new mitochondrial genomes

    Directory of Open Access Journals (Sweden)

    Phillips Matthew J

    2008-01-01

    Full Text Available Abstract Background Evolutionary biologists are often misled by convergence of morphology and this has been common in the study of bird evolution. However, the use of molecular data sets has its own problems and phylogenies based on short DNA sequences have the potential to mislead us too. The relationships among clades and timing of the evolution of modern birds (Neoaves) have not yet been well resolved. Evidence of convergence of morphology remains controversial. With six new bird mitochondrial genomes (hummingbird, swift, kagu, rail, flamingo and grebe) we test the proposed Metaves/Coronaves division within Neoaves and the parallel radiations in this primary avian clade. Results Our mitochondrial trees did not return the Metaves clade that had been proposed based on one nuclear intron sequence. We suggest that the high number of indels within the seventh intron of the β-fibrinogen gene at this phylogenetic level, which left a dataset with not a single site across the alignment shared by all taxa, resulted in artifacts during analysis. With respect to the overall avian tree, we find the flamingo and grebe are sister taxa and basal to the shorebirds (Charadriiformes). Using a novel site-stripping technique for noise-reduction we found this relationship to be stable. The hummingbird/swift clade is outside the large and very diverse group of raptors, shore and sea birds. Unexpectedly the kagu is not closely related to the rail in our analysis, but because neither the kagu nor the rail have close affinity to any taxa within this dataset of 41 birds, their placement is not yet resolved. Conclusion Our phylogenetic hypothesis based on 41 avian mitochondrial genomes (13,229 bp) rejects monophyly of seven Metaves species and we therefore conclude that the members of Metaves do not share a common evolutionary history within the Neoaves.

  12. Decoupled form and function in disparate herbivorous dinosaur clades

    Science.gov (United States)

    Lautenschlager, Stephan; Brassey, Charlotte A.; Button, David J.; Barrett, Paul M.

    2016-05-01

    Convergent evolution, the acquisition of morphologically similar traits in unrelated taxa due to similar functional demands or environmental factors, is a common phenomenon in the animal kingdom. Consequently, the occurrence of similar form is used routinely to address fundamental questions in morphofunctional research and to infer function in fossils. However, such qualitative assessments can be misleading and it is essential to test form/function relationships quantitatively. The parallel occurrence of a suite of morphologically convergent craniodental characteristics in three herbivorous, phylogenetically disparate dinosaur clades (Sauropodomorpha, Ornithischia, Theropoda) provides an ideal test case. A combination of computational biomechanical models (Finite Element Analysis, Multibody Dynamics Analysis) demonstrate that despite a high degree of morphological similarity between representative taxa (Plateosaurus engelhardti, Stegosaurus stenops, Erlikosaurus andrewsi) from these clades, their biomechanical behaviours are notably different and difficult to predict on the basis of form alone. These functional differences likely reflect dietary specialisations, demonstrating the value of quantitative biomechanical approaches when evaluating form/function relationships in extinct taxa.

  13. Decoupled form and function in disparate herbivorous dinosaur clades.

    Science.gov (United States)

    Lautenschlager, Stephan; Brassey, Charlotte A; Button, David J; Barrett, Paul M

    2016-05-20

    Convergent evolution, the acquisition of morphologically similar traits in unrelated taxa due to similar functional demands or environmental factors, is a common phenomenon in the animal kingdom. Consequently, the occurrence of similar form is used routinely to address fundamental questions in morphofunctional research and to infer function in fossils. However, such qualitative assessments can be misleading and it is essential to test form/function relationships quantitatively. The parallel occurrence of a suite of morphologically convergent craniodental characteristics in three herbivorous, phylogenetically disparate dinosaur clades (Sauropodomorpha, Ornithischia, Theropoda) provides an ideal test case. A combination of computational biomechanical models (Finite Element Analysis, Multibody Dynamics Analysis) demonstrate that despite a high degree of morphological similarity between representative taxa (Plateosaurus engelhardti, Stegosaurus stenops, Erlikosaurus andrewsi) from these clades, their biomechanical behaviours are notably different and difficult to predict on the basis of form alone. These functional differences likely reflect dietary specialisations, demonstrating the value of quantitative biomechanical approaches when evaluating form/function relationships in extinct taxa.

  14. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  15. Culture and hybridization experiments on an Ulva clade including the Qingdao strain blooming in the Yellow Sea.

    Directory of Open Access Journals (Sweden)

    Masanori Hiraoka

    2011-05-01

    Full Text Available In the summer of 2008, immediately prior to the Beijing Olympics, a massive green tide of the genus Ulva covered the Qingdao coast of the Yellow Sea in China. Based on molecular analyses using the nuclear encoded rDNA internal transcribed spacer (ITS) region, the Qingdao strains dominating the green tide were reported to be included in a single phylogenetic clade, currently regarded as a single species. On the other hand, our detailed phylogenetic analyses of the clade, using a higher resolution DNA marker, suggested that two genetically separate entities could be included within the clade. However, speciation within the Ulva clade has not yet been examined. We examined the occurrence of an intricate speciation within the clade, including the Qingdao strains, via combined studies of culture, hybridization and phylogenetic analysis. The two entities separated by our phylogenetic analyses of the clade were simply distinguished as U. linza and U. prolifera morphologically by the absence or presence of branches in cultured thalli. The inclusion of sexual strains and several asexual strains were found in each taxon. Hybridizations among the sexual strains also supported the separation by a partial gamete incompatibility. The sexually reproducing Qingdao strains crossed with U. prolifera without any reproductive boundary, but a complete reproductive isolation to U. linza occurred by gamete incompatibility. The results demonstrate that the U. prolifera group includes two types of sexual strains distinguishable by crossing affinity to U. linza. Species identification within the Ulva clade requires high resolution DNA markers and/or hybridization experiments and is not possible by reliance on the ITS markers alone.

  16. Culture and Hybridization Experiments on an Ulva Clade Including the Qingdao Strain Blooming in the Yellow Sea

    Science.gov (United States)

    Hiraoka, Masanori; Ichihara, Kensuke; Zhu, Wenrong; Ma, Jiahai; Shimada, Satoshi

    2011-01-01

    In the summer of 2008, immediately prior to the Beijing Olympics, a massive green tide of the genus Ulva covered the Qingdao coast of the Yellow Sea in China. Based on molecular analyses using the nuclear encoded rDNA internal transcribed spacer (ITS) region, the Qingdao strains dominating the green tide were reported to be included in a single phylogenetic clade, currently regarded as a single species. On the other hand, our detailed phylogenetic analyses of the clade, using a higher resolution DNA marker, suggested that two genetically separate entities could be included within the clade. However, speciation within the Ulva clade has not yet been examined. We examined the occurrence of an intricate speciation within the clade, including the Qingdao strains, via combined studies of culture, hybridization and phylogenetic analysis. The two entities separated by our phylogenetic analyses of the clade were simply distinguished as U. linza and U. prolifera morphologically by the absence or presence of branches in cultured thalli. The inclusion of sexual strains and several asexual strains were found in each taxon. Hybridizations among the sexual strains also supported the separation by a partial gamete incompatibility. The sexually reproducing Qingdao strains crossed with U. prolifera without any reproductive boundary, but a complete reproductive isolation to U. linza occurred by gamete incompatibility. The results demonstrate that the U. prolifera group includes two types of sexual strains distinguishable by crossing affinity to U. linza. Species identification within the Ulva clade requires high resolution DNA markers and/or hybridization experiments and is not possible by reliance on the ITS markers alone. PMID:21573216

  17. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  18. Regressive Evolution of Photosynthesis in the Roseobacter Clade

    Czech Academy of Sciences Publication Activity Database

    Koblížek, Michal; Zeng, Yonghui; Horák, A.; Oborník, Miroslav

    2013-01-01

    Vol. 66, No. 2013 (2013), pp. 385-405 ISSN 0065-2296 R&D Projects: GA ČR GAP501/10/0221; GA ČR GBP501/12/G055; GA MŠk ED2.1.00/03.0110 Institutional support: RVO:61388971 Keywords: roseobacter clade * photosynthesis * marine microbial communities Subject RIV: EE - Microbiology, Virology Impact factor: 1.740, year: 2013

  19. Major clades of Agaricales: a multilocus phylogenetic overview.

    Science.gov (United States)

    P. Brandon Matheny; Judd M. Curtis; Valerie Hofstetter; M. Catherine Aime; Jean-Marc Moncalvo; Zai-Wei Ge; Zhu-Liang Yang; Joseph F. Ammirati; Timothy J. Baroni; Neale L. Bougher; Karen W. Lodge Hughes; Richard W. Kerrigan; Michelle T. Seidl; Duur K. Aanen; Matthew DeNitis; Graciela M. Daniele; Dennis E. Desjardin; Bradley R. Kropp; Lorelei L. Norvell; Andrew Parker; Else C. Vellinga; Rytas Vilgalys; David S. Hibbett

    2006-01-01

    An overview of the phylogeny of the Agaricales is presented based on a multilocus analysis of a six-gene region supermatrix. Bayesian analyses of 5611 nucleotide characters of rpb1, rpb1-intron 2, rpb2 and 18S, 25S, and 5.8S ribosomal RNA genes recovered six major clades, which are recognized informally and labeled the Agaricoid, Tricholomatoid, Marasmioid, Pluteoid,...

  20. Origin and Population Dynamics of a Novel HIV-1 Subtype G Clade Circulating in Cape Verde and Portugal.

    Science.gov (United States)

    de Pina-Araujo, Isabel Inês M; Delatorre, Edson; Guimarães, Monick L; Morgado, Mariza G; Bello, Gonzalo

    2015-01-01

    The human immunodeficiency virus type 1 (HIV-1) subtype G is the most prevalent and second most prevalent HIV-1 clade in Cape Verde and Portugal, respectively; but there is no information about the origin and spatiotemporal dispersal pattern of this HIV-1 clade circulating in those countries. To this end, we used Maximum Likelihood and Bayesian coalescent-based methods to analyze a collection of 578 HIV-1 subtype G pol sequences sampled throughout Portugal, Cape Verde and 11 other countries from West and Central Africa over a period of 22 years (1992 to 2013). Our analyses indicate that most subtype G sequences from Cape Verde (80%) and Portugal (95%) branched together in a distinct monophyletic cluster (here called G(CV-PT)). The G(CV-PT) clade probably emerged after a single migration of the virus out of Central Africa into Cape Verde between the late 1970s and the middle 1980s, followed by a rapid dissemination to Portugal a couple of years later. Reconstruction of the demographic history of the G(CV-PT) clade circulating in Cape Verde and Portugal indicates that this viral clade displayed an initial phase of exponential growth during the 1980s and 1990s, followed by a decline in growth rate since the early 2000s. Our data also indicate that during the exponential growth phase the G(CV-PT) clade recombined with a preexisting subtype B viral strain circulating in Portugal, originating the CRF14_BG clade that was later disseminated to Spain and Cape Verde. Historical and recent human population movements between Angola, Cape Verde and Portugal probably played a key role in the origin and dispersal of the G(CV-PT) and CRF14_BG clades.

  1. Broad antibody mediated cross-neutralization and preclinical immunogenicity of new codon-optimized HIV-1 clade CRF02_AG and G primary isolates.

    Directory of Open Access Journals (Sweden)

    Simon M Agwale

    Full Text Available Creation of an effective vaccine for HIV has been an elusive goal of the scientific community for almost 30 years. Neutralizing antibodies are assumed to be pivotal to the success of a prophylactic vaccine but previous attempts to make an immunogen capable of generating neutralizing antibodies to primary "street strain" isolates have resulted in responses of very limited breadth and potency. The objective of the study was to determine the breadth and strength of neutralizing antibodies against autologous and heterologous primary isolates in a cohort of HIV-1 infected Nigerians and to characterize envelopes from subjects with particularly broad or strong immune responses for possible use as vaccine candidates in regions predominated by HIV-1 CRF02_AG and G subtypes. Envelope vectors from a panel of primary Nigerian isolates were constructed and tested with plasma/sera from the same cohort using the PhenoSense HIV neutralizing antibody assay (Monogram Biosciences Inc, USA) to assess the breadth and potency of neutralizing antibodies. The immediate goal of this study was realized by the recognition of three broadly cross-neutralizing sera: (NG2-clade CRF02_AG, NG3-clade CRF02_AG and NG9-clade G). Based on these findings, envelope gp140 sequences from NG2 and NG9, complemented with a gag sequence (Clade G) and consensus tat (CRF02_AG and G) antigens have been codon-optimized, synthesized, cloned and evaluated in BALB/c mice. The intramuscular administration of these plasmid DNA constructs, followed by two booster DNA immunizations, induced substantial specific humoral response against all constructs and strong cellular responses against the gag and tat constructs. These preclinical findings provide a framework for the design of candidate vaccine for use in regions where the HIV-1 epidemic is driven by clades CRF02_AG and G.

  2. Broad antibody mediated cross-neutralization and preclinical immunogenicity of new codon-optimized HIV-1 clade CRF02_AG and G primary isolates.

    Science.gov (United States)

    Agwale, Simon M; Forbi, Joseph C; Notka, Frank; Wrin, Terri; Wild, Jens; Wagner, Ralf; Wolf, Hans

    2011-01-01

    Creation of an effective vaccine for HIV has been an elusive goal of the scientific community for almost 30 years. Neutralizing antibodies are assumed to be pivotal to the success of a prophylactic vaccine but previous attempts to make an immunogen capable of generating neutralizing antibodies to primary "street strain" isolates have resulted in responses of very limited breadth and potency. The objective of the study was to determine the breadth and strength of neutralizing antibodies against autologous and heterologous primary isolates in a cohort of HIV-1 infected Nigerians and to characterize envelopes from subjects with particularly broad or strong immune responses for possible use as vaccine candidates in regions predominated by HIV-1 CRF02_AG and G subtypes. Envelope vectors from a panel of primary Nigerian isolates were constructed and tested with plasma/sera from the same cohort using the PhenoSense HIV neutralizing antibody assay (Monogram Biosciences Inc, USA) to assess the breadth and potency of neutralizing antibodies. The immediate goal of this study was realized by the recognition of three broadly cross-neutralizing sera: (NG2-clade CRF02_AG, NG3-clade CRF02_AG and NG9- clade G). Based on these findings, envelope gp140 sequences from NG2 and NG9, complemented with a gag sequence (Clade G) and consensus tat (CRF02_AG and G) antigens have been codon-optimized, synthesized, cloned and evaluated in BALB/c mice. The intramuscular administration of these plasmid DNA constructs, followed by two booster DNA immunizations, induced substantial specific humoral response against all constructs and strong cellular responses against the gag and tat constructs. These preclinical findings provide a framework for the design of candidate vaccine for use in regions where the HIV-1 epidemic is driven by clades CRF02_AG and G.

  3. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  4. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  5. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  6. Modern bioinformatics meets traditional Chinese medicine.

    Science.gov (United States)

    Gu, Peiqin; Chen, Huajun

    2014-11-01

    Traditional Chinese medicine (TCM) is gaining increasing attention with the emergence of integrative medicine and personalized medicine, characterized by pattern differentiation on individual variance and treatments based on natural herbal synergism. Investigating the effectiveness and safety of the potential mechanisms of TCM and the combination principles of drug therapies will bridge the cultural gap with Western medicine and improve the development of integrative medicine. Dealing with rapidly growing amounts of biomedical data and their heterogeneous nature are two important tasks among modern biomedical communities. Bioinformatics, as an emerging interdisciplinary field of computer science and biology, has become a useful tool for easing the data deluge pressure by automating the computation processes with informatics methods. Using these methods to retrieve, store and analyze the biomedical data can effectively reveal the associated knowledge hidden in the data, and thus promote the discovery of integrated information. Recently, these techniques of bioinformatics have been used for facilitating the interactional effects of both Western medicine and TCM. The analysis of TCM data using computational technologies provides biological evidence for the basic understanding of TCM mechanisms, safety and efficacy of TCM treatments. At the same time, the carrier and targets associated with TCM remedies can inspire the rethinking of modern drug development. This review summarizes the significant achievements of applying bioinformatics techniques to many aspects of the research in TCM, such as analysis of TCM-related '-omics' data and techniques for analyzing biological processes and pharmaceutical mechanisms of TCM, which have shown certain potential of bringing new thoughts to both sides. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  7. Transmission of a heterologous clade C Symbiodinium in a model anemone infection system via asexual reproduction

    Directory of Open Access Journals (Sweden)

    Wan-Nan U. Chen

    2016-08-01

    Anemones of genus Exaiptasia are used as model organisms for the study of cnidarian-dinoflagellate (genus Symbiodinium) endosymbiosis. However, while most reef-building corals harbor Symbiodinium of clade C, Exaiptasia spp. anemones mainly harbor clade B Symbiodinium (ITS2 type B1) populations. In this study, we reveal for the first time that bleached Exaiptasia pallida anemones can establish a symbiotic relationship with a clade C Symbiodinium (ITS2 type C1). We further found that anemones can transmit the exogenously supplied clade C Symbiodinium cells to their offspring by asexual reproduction (pedal laceration). In order to corroborate the establishment of stable symbiosis, we used microscopic techniques and genetic analyses to examine several generations of anemones, and the results of these endeavors confirmed the sustainability of the system. These findings provide a framework for understanding the differences in infection dynamics between homologous and heterologous dinoflagellate types using a model anemone infection system.

  8. The Bioinformatics of Integrative Medical Insights: Proposals for an International Psycho-Social and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    We propose the formation of an International Psycho-Social and Cultural Bioinformatics Project (IPCBP) to explore the research foundations of Integrative Medical Insights (IMI) on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existential wellness. We anticipate that the IPCBP will require a unique international collaboration of academic institutions, researchers, and clinical practitioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  9. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP) to explore the research foundations of Integrative Medical Insights (IMI) on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existential wellness. We anticipate that the IPCBP will require a unique international collaboration of academic institutions, researchers, and clinical practitioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  10. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    Directory of Open Access Journals (Sweden)

    Lei Wang

    Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days). Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease.

  11. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  12. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  13. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode

  14. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  15. How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications.

    Science.gov (United States)

    Khomtchouk, Bohdan B; Weitz, Edmund; Karp, Peter D; Wahlestedt, Claes

    2016-12-31

    We present a rationale for expanding the presence of the Lisp family of programming languages in bioinformatics and computational biology research. Put simply, Lisp-family languages enable programmers to more quickly write programs that run faster than in other languages. Languages such as Common Lisp, Scheme and Clojure facilitate the creation of powerful and flexible software that is required for complex and rapidly evolving domains like biology. We will point out several important key features that distinguish languages of the Lisp family from other programming languages, and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSLs): languages that are specialized to a particular area, and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the 'programmable programming language'. We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and artificial intelligence research in bioinformatics and computational biology. © The Author 2016. Published by Oxford University Press.

  16. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    Science.gov (United States)

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been made in recent years thanks to metagenomics, which makes it possible to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred to as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. Access to proper computational systems for the large-scale bioinformatic analysis of HTS data currently represents one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities through a completely automated workflow covering all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented as a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent

  17. bioalcidae, samjs and vcffilterjs: object-oriented formatters and filters for bioinformatics files.

    Science.gov (United States)

    Lindenbaum, Pierre; Redon, Richard

    2018-04-01

    Reformatting and filtering bioinformatics files are common tasks for bioinformaticians. Standard Linux tools and specific programs are usually used to perform such tasks, but there is still a gap between using these tools and the programming interface of some existing libraries. In this study, we developed a set of tools, namely bioalcidae, samjs and vcffilterjs, that reformat or filter files using a JavaScript engine or a pure Java expression, taking advantage of the Java API for high-throughput sequencing data (htsjdk). Availability: https://github.com/lindenb/jvarkit. Contact: pierre.lindenbaum@univ-nantes.fr.

  18. New Link in Bioinformatics Services Value Chain: Position, Organization and Business Model

    Directory of Open Access Journals (Sweden)

    Mladen Čudanov

    2012-11-01

    This paper presents developments in the bioinformatics services industry value chain, based on the cloud computing paradigm. As genome sequencing costs per megabase drop exponentially, the industry needs to adapt. The paper has two parts: a theoretical analysis and a practical example, the company Seven Bridges Genomics. We focus on explaining the organizational, business and financial aspects of the new business model in bioinformatics services, rather than the technical side of the problem. In light of that, we present a twofold business model fit for core bioinformatics research and Information and Communication Technology (ICT) support in the new environment, with a higher level of capital utilization and better resistance to business risks.

  19. An overview of bioinformatics methods for modeling biological pathways in yeast.

    Science.gov (United States)

    Hou, Jie; Acharya, Lipi; Zhu, Dongxiao; Cheng, Jianlin

    2016-03-01

    The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  20. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  1. Genomic differentiation among two strains of the PS1 clade isolated from geographically separated marine habitats

    KAUST Repository

    Jimenez Infante, Francy M.

    2014-05-22

    Using dilution-to-extinction cultivation, we isolated a strain affiliated with the PS1 clade from surface waters of the Red Sea. Strain RS24 represents the second isolate of this group of marine Alphaproteobacteria after IMCC14465, which was isolated from the East (Japan) Sea. The PS1 clade is a sister group to the OCS116 clade, together forming a putatively novel order closely related to Rhizobiales. While most genomic features and most of the genetic content are conserved between RS24 and IMCC14465, their average nucleotide identity (ANI) is < 81%, suggesting two distinct species of the PS1 clade. In addition to encoding two different variants of proteorhodopsin genes, they also harbor several unique genomic islands that contain genes related to degradation of aromatic compounds in IMCC14465 and to polymer degradation in RS24, possibly reflecting the physicochemical differences in the environments they were isolated from. No clear differences in abundance of the genomic content of either strain could be found in fragment recruitment analyses using different metagenomic datasets, in which both genomes were detectable, albeit as a minor part of the communities. The comparative genomic analysis of both isolates of the PS1 clade and the fragment recruitment analysis provide first insights into the ecology of this group. © 2014 Federation of European Microbiological Societies.

  2. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  3. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  4. Combining medical informatics and bioinformatics toward tools for personalized medicine.

    Science.gov (United States)

    Sarachan, B D; Simmons, M K; Subramanian, P; Temkin, J M

    2003-01-01

    Key bioinformatics and medical informatics research areas need to be identified to advance knowledge and understanding of disease risk factors and molecular disease pathology in the 21st century toward new diagnoses, prognoses, and treatments. Three high-impact informatics areas are identified: predictive medicine (to identify significant correlations within clinical data using statistical and artificial intelligence methods), along with pathway informatics and cellular simulations (that combine biological knowledge with advanced informatics to elucidate molecular disease pathology). Initial predictive models have been developed for a pilot study in Huntington's disease. An initial bioinformatics platform has been developed for the reconstruction and analysis of pathways, and work has begun on pathway simulation. A bioinformatics research program has been established at GE Global Research Center as an important technology toward next generation medical diagnostics. We anticipate that 21st century medical research will be a combination of informatics tools with traditional biology wet lab research, and that this will translate to increased use of informatics techniques in the clinic.

  5. ALEA: a toolbox for allele-specific epigenomics analysis.

    Science.gov (United States)

    Younesy, Hamid; Möller, Torsten; Heravi-Moussavi, Alireza; Cheng, Jeffrey B; Costello, Joseph F; Lorincz, Matthew C; Karimi, Mohammad M; Jones, Steven J M

    2014-04-15

    The assessment of expression and epigenomic status using sequencing based methods provides an unprecedented opportunity to identify and correlate allelic differences with epigenomic status. We present ALEA, a computational toolbox for allele-specific epigenomics analysis, which incorporates allelic variation data within existing resources, allowing for the identification of significant associations between epigenetic modifications and specific allelic variants in human and mouse cells. ALEA provides a customizable pipeline of command line tools for allele-specific analysis of next-generation sequencing data (ChIP-seq, RNA-seq, etc.) that takes the raw sequencing data and produces separate allelic tracks ready to be viewed on genome browsers. The pipeline has been validated using human and hybrid mouse ChIP-seq and RNA-seq data. The package, test data and usage instructions are available online at http://www.bcgsc.ca/platform/bioinfo/software/alea. Contact: mkarimi1@interchange.ubc.ca or sjones@bcgsc.ca. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  6. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  7. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Science.gov (United States)

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the

  8. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Background: High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge-based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this, the Python scripting language is the optimal choice since its philosophy is to write understandable source code. Results: p3d is an object-oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three-dimensional protein structure files (PDB files). p3d's strength arises from the combination of (a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, (b) set theory and (c) functions that allow combining (a) and (b) and that use human-readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion: p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.
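    The idea of combining spatial queries with set theory can be illustrated with a minimal Python sketch. It does not use p3d or reproduce its API: the function names (`pdb_line`, `parse_atoms`, `within`, `select`) are hypothetical, the demo coordinates are invented, and a brute-force distance scan stands in for the BSP tree described in the abstract. It assumes only the standard fixed-column layout of PDB ATOM/HETATM records.

```python
from math import dist  # Python 3.8+


def pdb_line(serial, name, resname, chain, resid, x, y, z):
    """Build one fixed-column ATOM record (hypothetical helper for demo data)."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resid:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atoms(pdb_text):
    """Parse ATOM/HETATM records using the PDB fixed-column positions."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),     # columns 13-16
                "resname": line[17:20].strip(),  # columns 18-20
                "chain": line[21],               # column 22
                "resid": int(line[22:26]),       # columns 23-26
                "xyz": (float(line[30:38]),      # columns 31-54
                        float(line[38:46]),
                        float(line[46:54])),
            })
    return atoms


def within(atoms, center, radius):
    """Indices of atoms within `radius` of `center` (linear scan, not a BSP tree)."""
    return {i for i, a in enumerate(atoms) if dist(a["xyz"], center) <= radius}


def select(atoms, **criteria):
    """Indices of atoms whose fields equal all the given criteria."""
    return {i for i, a in enumerate(atoms)
            if all(a[key] == value for key, value in criteria.items())}


# Invented demo structure: a histidine, an iron atom and a distant glycine.
demo = "\n".join([
    pdb_line(1, "CA", "HIS", "A", 10, 0.0, 0.0, 0.0),
    pdb_line(2, "FE", "HEM", "A", 200, 1.5, 0.0, 0.0),
    pdb_line(3, "CA", "GLY", "A", 11, 10.0, 0.0, 0.0),
    pdb_line(4, "NE2", "HIS", "A", 10, 2.0, 1.0, 0.0),
])
atoms = parse_atoms(demo)
iron = atoms[1]["xyz"]

# Set algebra on index sets: histidine atoms within 3 units of the iron,
# excluding alpha carbons.
near_his = within(atoms, iron, 3.0) & select(atoms, resname="HIS")
side_chain_only = near_his - select(atoms, name="CA")
```

    Representing each query result as a set of atom indices lets intersections, unions and differences compose freely; a tree-based spatial index would replace only the body of `within`.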

  9. A comparison of common programming languages used in bioinformatics.

    Science.gov (United States)

    Fourment, Mathieu; Gillings, Michael R

    2008-02-05

    The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
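    For readers unfamiliar with the first of the three benchmarked methods, the following is a minimal Python sketch of the Sellers algorithm for approximate substring matching. It is not the benchmark's source code; the function name and return convention are assumptions made for illustration, with unit costs for substitutions, insertions and deletions.

```python
def sellers_min_distance(pattern, text):
    """Return (best_distance, end_index) of the best approximate match.

    best_distance is the smallest edit distance between `pattern` and any
    substring of `text`; end_index is the 1-based position in `text` where
    the first such best match ends (0 means the empty prefix).
    """
    m = len(pattern)
    # Column for the empty text prefix: matching i pattern characters
    # against nothing costs i deletions.
    prev = list(range(m + 1))
    best, best_end = prev[m], 0
    for j, text_char in enumerate(text, start=1):
        # Row 0 is always 0: a match may start at any position in the text.
        # This is the only difference from global edit distance.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == text_char else 1
            curr[i] = min(prev[i - 1] + cost,  # match or substitution
                          prev[i] + 1,         # extra character in text
                          curr[i - 1] + 1)     # pattern character skipped
        if curr[m] < best:
            best, best_end = curr[m], j
        prev = curr
    return best, best_end
```

    The sketch keeps only two columns of the dynamic programming table, so memory use grows with the pattern length rather than with the text length, while running time is proportional to the product of the two lengths.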

  10. The eBioKit, a stand-alone educational platform for bioinformatics.

    Science.gov (United States)

    Hernández-de-Diego, Rafael; de Villiers, Etienne P; Klingström, Tomas; Gourlé, Hadrien; Conesa, Ana; Bongcam-Rudloff, Erik

    2017-09-01

    Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

  11. Resource quantity and quality determine the inter-specific associations between ecosystem engineers and resource users in a cavity-nest web.

    Science.gov (United States)

    Robles, Hugo; Martin, Kathy

    2013-01-01

    While ecosystem engineering is a widespread structural force of ecological communities, the mechanisms underlying the inter-specific associations between ecosystem engineers and resource users are poorly understood. A proper knowledge of these mechanisms is, however, essential to understand how communities are structured. Previous studies suggest that increasing the quantity of resources provided by ecosystem engineers enhances populations of resource users. In a long-term study (1995-2011), we show that the quality of the resources (i.e. tree cavities) provided by ecosystem engineers is also a key feature that explains the inter-specific associations in a tree cavity-nest web. Red-naped sapsuckers (Sphyrapicus nuchalis) provided the most abundant cavities (52% of cavities, 0.49 cavities/ha). These cavities were less likely to be used than other cavity types by mountain bluebirds (Sialia currucoides), but provided numerous nest-sites (41% of nesting cavities) to tree swallows (Tachycineta bicolor). Swallows experienced low reproductive outputs in northern flicker (Colaptes auratus) cavities compared to those in sapsucker cavities (1.1 vs. 2.1 fledglings/nest), but the highly abundant flickers (33% of cavities, 0.25 cavities/ha) provided numerous suitable nest-sites for bluebirds (58%). The relative shortage of cavities supplied by hairy woodpeckers (Picoides villosus) and fungal/insect decay (high quality nest-sites for both bluebirds and swallows. Because both the quantity and quality of resources supplied by different ecosystem engineers may explain the amount of resources used by each resource user, conservation strategies may require different management actions to be implemented for the key ecosystem engineer of each resource user. We, therefore, urge the incorporation of both resource quantity and quality into models that assess community dynamics to improve conservation actions and our understanding of ecological communities based on ecosystem engineering.

  12. The structural bioinformatics library: modeling in biomolecular science and beyond.

    Science.gov (United States)

    Cazals, Frédéric; Dreyfus, Tom

    2017-04-01

    Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performances. The SBL involves four software components (1-4 thereafter). For end-users, the SBL provides ready to use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with modular design, involving core (2) algorithms, (3) biophysical models and (4) modules, the latter being especially suited to develop novel applications. The SBL comes with a thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Contact: Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  13. A genetic polymorphism evolving in parallel in two cell compartments and in two clades

    Directory of Open Access Journals (Sweden)

    Watt Ward B

    2013-01-01

    Full Text Available Abstract Background: The enzyme phosphoenolpyruvate carboxykinase, PEPCK, occurs in its guanosine-nucleotide-using form in animals and a few prokaryotes. We study its natural genetic variation in Colias (Lepidoptera, Pieridae). PEPCK offers a route, alternative to pyruvate kinase, for carbon skeletons to move between cytosolic glycolysis and mitochondrial Krebs cycle reactions. Results: PEPCK is expressed in both cytosol and mitochondrion, but differently in diverse animal clades. In vertebrates and independently in Drosophila, compartment-specific paralogous genes occur. In a contrasting expression strategy, compartment-specific PEPCKs of Colias and of the silkmoth, Bombyx, differ only in their first, 5′, exons; these are alternatively spliced onto a common series of following exons. In two Colias species from distinct clades, PEPCK sequence is highly variable at nonsynonymous and synonymous sites, mainly in its common exons. Three major amino acid polymorphisms, Gly 335 ↔ Ser, Asp 503 ↔ Glu, and Ile 629 ↔ Val occur in both species, and in the first two cases are similar in frequency between species. Homology-based structural modelling shows that the variants can alter hydrogen bonding, salt bridging, or van der Waals interactions of amino acid side chains, locally or at one another's sites which are distant in PEPCK's structure, and thus may affect its enzyme function. We ask, using coalescent simulations, if these polymorphisms' cross-species similarities are compatible with neutral evolution by genetic drift, but find the probability of this null hypothesis is 0.001 ≤ P ≤ 0.006 under differing scenarios. Conclusion: Our results make the null hypothesis of neutrality of these PEPCK polymorphisms quite unlikely, but support an alternative hypothesis that they are maintained by natural selection in parallel in the two species. This alternative can now be justifiably tested further via studies of PEPCK genotypes' effects

  14. The RHNumtS compilation: Features and bioinformatics approaches to locate and quantify Human NumtS

    Directory of Open Access Journals (Sweden)

    Saccone Cecilia

    2008-06-01

    Full Text Available Abstract Background: To a greater or lesser extent, eukaryotic nuclear genomes contain fragments of their mitochondrial genome counterpart, deriving from the random insertion of damaged mtDNA fragments. NumtS (Nuclear mt Sequences) are not equally abundant in all species, and are redundant and polymorphic in terms of copy number. In population and clinical genetics, it is important to have a complete overview of NumtS quantity and location. Searching PubMed for NumtS or Mitochondrial pseudo-genes yields hundreds of papers reporting Human NumtS compilations produced by in silico or wet-lab approaches. A comparison of published compilations clearly shows significant discrepancies among data, due both to unwise application of Bioinformatics methods and to a not yet correctly assembled nuclear genome. To optimize quantification and location of NumtS, we produced a consensus compilation of Human NumtS by applying various bioinformatics approaches. Results: Location and quantification of NumtS may be achieved by applying database similarity searching methods: we have applied various methods such as Blastn, MegaBlast and BLAT, changing both parameters and database; the results were compared, further analysed and checked against the already published compilations, thus producing the Reference Human Numt Sequences (RHNumtS) compilation. The resulting NumtS total 190. Conclusion: The RHNumtS compilation represents a highly reliable reference basis, which may allow designing a lab protocol to test the actual existence of each NumtS. Here we report preliminary results based on PCR amplification and sequencing on 41 NumtS selected from RHNumtS among those with lower score. In parallel, we are currently designing the RHNumtS database structure for implementation in the HmtDB resource. In the future, the same database will host NumtS compilations from other organisms, but these will be generated only when the nuclear genome of a specific organism has reached a high

  15. Green Fluorescent Protein-Focused Bioinformatics Laboratory Experiment Suitable for Undergraduates in Biochemistry Courses

    Science.gov (United States)

    Rowe, Laura

    2017-01-01

    An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…

  16. Quantifying variation in speciation and extinction rates with clade data.

    Science.gov (United States)

    Paradis, Emmanuel; Tedesco, Pablo A; Hugueny, Bernard

    2013-12-01

    High-level phylogenies are very common in evolutionary analyses, although they are often treated as incomplete data. Here, we provide statistical tools to analyze what we name "clade data," which are the ages of clades together with their numbers of species. We develop a general approach for the statistical modeling of variation in speciation and extinction rates, including temporal variation, unknown variation, and linear and nonlinear modeling. We show how this approach can be generalized to a wide range of situations, including testing the effects of life-history traits and environmental variables on diversification rates. We report the results of an extensive simulation study to assess the performance of some statistical tests presented here as well as of the estimators of speciation and extinction rates. These latter results suggest the possibility to estimate correctly extinction rate in the absence of fossils. An example with data on fish is presented. © 2013 The Author(s). Evolution © 2013 The Society for the Study of Evolution.
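To make the idea of "clade data" concrete (a clade age paired with a species count), here is a deliberately simplified Python sketch. It assumes a pure-birth (Yule) process with no extinction and clades dated by stem age, which is a much narrower model than the general framework the abstract describes, and it is not the authors' implementation. The function names and the numbers in the usage note are hypothetical.

```python
import math


def yule_rate(stem_age: float, n_species: int) -> float:
    """Speciation rate estimate ln(n) / t for one clade under a
    pure-birth model started from a single lineage at the stem age."""
    if stem_age <= 0 or n_species < 1:
        raise ValueError("stem_age must be > 0 and n_species >= 1")
    return math.log(n_species) / stem_age


def yule_log_likelihood(rate: float, clades) -> float:
    """Log-likelihood of (stem_age, n_species) pairs under pure birth.

    Starting from one lineage, the species count after time t is
    geometric with success probability p = exp(-rate * t):
        P(n) = p * (1 - p) ** (n - 1)
    """
    if rate <= 0:
        raise ValueError("rate must be > 0")
    total = 0.0
    for age, n in clades:
        p = math.exp(-rate * age)
        total += -rate * age + (n - 1) * math.log1p(-p)
    return total
```

With an illustrative clade of stem age 10.0 and 100 species, `yule_rate(10.0, 100)` gives ln(100)/10, and `yule_log_likelihood` evaluated for that single clade is highest at that same rate. Extinction, temporal variation and covariates, which are the focus of the record above, are outside the scope of this sketch.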

  17. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing. Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  18. Identification of ochratoxin A producing Aspergillus carbonarius and A. niger clade isolated from grapes using the loop-mediated isothermal amplification (LAMP) reaction.

    Science.gov (United States)

    Storari, M; von Rohr, R; Pertot, I; Gessler, C; Broggini, G A L

    2013-04-01

    To develop two assays based on the loop-mediated isothermal amplification (LAMP) of DNA for the quick and specific identification of Aspergillus carbonarius and ochratoxigenic strains of the Aspergillus niger clade isolated from grapes. Two sets of primers were designed based on the polyketide synthase genes involved or putatively involved in ochratoxin A (OTA) biosynthesis in A. carbonarius and A. niger clade. Hydroxynaphthol blue was used as an indirect method to indicate DNA amplification. The limit of detection of both assays was comparable to that of a PCR reaction. Specificities of the reactions were tested using DNA from different black aspergilli isolated from grapes. The two LAMP assays were then used to identify A. carbonarius and ochratoxigenic A. niger and A. awamori grown in pure cultures without a prior DNA extraction. The two LAMP assays permitted quick and specific identification of DNA from OTA-producing black aspergilli, as well as of isolates grown in pure culture. Monitoring vineyards for the presence of OTA-producing strains is part of the measures to minimize the occurrence of OTA in grape products. The two LAMP assays developed here could potentially be used to speed up the screening of vineyards for the presence of OTA-producing black aspergilli. © 2013 The Society for Applied Microbiology.

  19. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    Science.gov (United States)

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
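For readers unfamiliar with the protocol named in this record, the sketch below shows the first two steps of the standard OAuth 2.0 authorization code flow in Python: building the authorization request and reading the authorization code from the redirect. It is a generic illustration under stated assumptions, not the IABio system's code; the endpoint, client identifier, redirect URI and scope used in the usage note are hypothetical placeholders, and only the query parameter names come from the OAuth 2.0 specification.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scope, state=None):
    """Return (url, state) for an authorization code request.

    `state` is an opaque value echoed back by the server; it lets the
    client detect forged redirects.
    """
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",   # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def extract_authorization_code(callback_url, expected_state):
    """Validate the redirect from the authorization server and return
    the authorization code carried in its query string."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged redirect")
    if "error" in params:
        raise ValueError(f"authorization failed: {params['error'][0]}")
    if "code" not in params:
        raise ValueError("no authorization code in redirect")
    return params["code"][0]
```

A client would call `build_authorization_url("https://iabio.example.org/oauth/authorize", "demo-client", "https://client.example.org/callback", "read:records")`, send the user to the returned URL, and later pass the redirect it receives to `extract_authorization_code`. Exchanging the code for an access token at the token endpoint is a further step not shown here.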

  20. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  1. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Full Text Available Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools. This activity is designed to provide secondary and undergraduate biology students with a hands-on activity meant to explore and understand gene structure with the use of basic bioinformatic tools. Students are provided an “unknown” sequence from which they are asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  2. Rise and demise of bioinformatics? Promise and progress.

    Directory of Open Access Journals (Sweden)

    Christos A Ouzounis

    Full Text Available The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

  3. Meeting review: 2002 O'Reilly Bioinformatics Technology Conference.

    Science.gov (United States)

    Counsell, Damian

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O'Reilly Bioinformatics Technology Conference. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences. Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O'Reilly himself, Tim O'Reilly. There were presentations, tutorials, debates, quizzes and even a 'jam session' for musical bioinformaticists.

  4. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe

    2016-01-01

    This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomery. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (i...

  5. An efficiently cleaved HIV-1 clade C Env selectively binds to neutralizing antibodies.

    Directory of Open Access Journals (Sweden)

    Saikat Boliar

    Full Text Available An ideal HIV-1 Env immunogen is expected to mimic the native trimeric conformation for inducing broadly neutralizing antibody responses. The native conformation is dependent on efficient cleavage of HIV-1 Env. The clade B isolate, JRFL Env is efficiently cleaved when expressed on the cell surface. Here, for the first time, we report the identification of a native clade C Env, 4-2.J41 that is naturally and efficiently cleaved on the cell surface as confirmed by its biochemical and antigenic characteristics. In addition to binding to several conformation-dependent neutralizing antibodies, 4-2.J41 Env binds efficiently to the cleavage-dependent antibody PGT151; thus validating its native cleaved conformation. In contrast, 4-2.J41 Env occludes non-neutralizing epitopes. The cytoplasmic-tail of 4-2.J41 Env plays an important role in maintaining its conformation. Furthermore, codon optimization of 4-2.J41 Env sequence significantly increases its expression while retaining its native conformation. Since clade C of HIV-1 is the prevalent subtype, identification and characterization of this efficiently cleaved Env would provide a platform for rational immunogen design.

  6. RISE OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY IN INDIA: A LOOK THROUGH PUBLICATIONS

    Directory of Open Access Journals (Sweden)

    Anjali Srivastava

    2017-09-01

    Full Text Available Computational biology and bioinformatics have been part and parcel of biomedical research for a few decades now. However, the institutionalization of bioinformatics research took place with the establishment of Distributed Information Centres (DISCs) in the research institutions of repute in various disciplines by the Department of Biotechnology, Government of India. Though, at initial stages, this endeavor was mainly focused on providing infrastructure for using information technology and internet based communication and tools for carrying out computational biology and in-silico assisted research in varied arena of research starting from disease biology to agricultural crops, spices, veterinary science and many more, the natural outcome of establishment of such facilities resulted in new experiments with bioinformatics tools. Thus, Biotechnology Information Systems (BTIS) grew into a solid movement and a large number of publications started coming out of these centres. At the end of the last century, bioinformatics started developing like a full-fledged research subject. In the last decade, a need was felt to actually make a factual estimation of the result of this endeavor of DBT which had, by then, established about two hundred centres in almost all disciplines of biomedical research. In a bid to evaluate the efforts and outcome of these centres, BTIS Centre at CSIR-CDRI, Lucknow was entrusted with collecting and collating the publications of these centres. However, when the full data was compiled, the DBT task force felt that the study must include Non-BTIS centres also so as to expand the report to have a glimpse of bioinformatics publications from the country.

  7. Distinct Processes Drive Diversification in Different Clades of Gesneriaceae.

    Science.gov (United States)

    Roalson, Eric H; Roberts, Wade R

    2016-07-01

    Using a time-calibrated phylogenetic hypothesis including 768 Gesneriaceae species (out of approximately 3300 species) and more than 29,000 aligned bases from 26 gene regions, we test Gesneriaceae for diversification rate shifts and the possible proximal drivers of these shifts: geographic distributions, growth forms, and pollination syndromes. Bayesian Analysis of Macroevolutionary Mixtures analyses found five significant rate shifts in Beslerieae, core Nematanthus, core Columneinae, core Streptocarpus, and Pacific Cyrtandra. These rate shifts correspond with shifts in diversification rates, as inferred by Binary State Speciation and Extinction Model and Geographic State Speciation and Extinction model, associated with hummingbird pollination, epiphytism, unifoliate growth, and geographic area. Our results suggest that diversification processes are extremely variable across Gesneriaceae clades with different combinations of characters influencing diversification rates in different clades. Diversification patterns between New and Old World lineages show dramatic differences, suggesting that the processes of diversification in Gesneriaceae are very different in these two geographic regions. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  8. Global Information Resources on Rice for Research and Development

    Directory of Open Access Journals (Sweden)

    Shri RAM

    2012-12-01

    Full Text Available Various issues concerning the progress of rice research are related to ambiguous germplasm identification, difficulty in tracing pedigree information, and lack of integration between genetic resources, characterization, breeding, evaluation and utilization data. These issues are the constraints in developing knowledge-intensive crop improvement programs. The rapid growth, development and the global spread of modern information and communication technology allow quick adoption in fundamental research. Thus, there is a need to provide an opportunity for the establishment of services which describe the rice information for better accessibility to information resources used by researchers to enhance the competitiveness. This work reviews some of the available resources on rice bioinformatics and their roles in elucidating and propagating biological and genomic information in rice research. These reviews will also enable stakeholders to understand and adopt the change in research and development and share knowledge with the global community of agricultural scientists. Establishments such as the International Rice Information System, the Rice Genome Research Project and the Integrated Rice Genome Explorer are major initiatives for the improvement of rice. The creation of databases for comparative studies of rice and other cereals is a major step in further improvement of genetic compositions. This paper will also highlight some of the initiatives and organizations working in the field of rice improvement and explore the availability of the various web resources for the purpose of research and development of rice. We are developing a meta web server for integration of online resources such as databases, web servers and journals in the area of bioinformatics. This integrated platform, with acronym iBIRA, is available online at ibiranet.in. The resources reviewed here are the excerpts from the resources integrated in iBIRA.

  9. Molecular relationships of fungi within the Fusarium redolens - F. hostae clade

    NARCIS (Netherlands)

    Baayen, R.P.; O'Donnell, K.; Breeuwsma, S.; Geiser, D.M.; Waalwijk, C.

    2001-01-01

    The evolutionary relationships of fungi in the Fusarium redolens - F. hostae clade were investigated by constructing nuclear and mitochondrial gene genealogies for 37 isolates representing the known genetic and pathogenic diversity of this lineage, together with 15 isolates from putative sister

  10. Somatic populations of PGT135-137 HIV-1-neutralizing antibodies identified by 454 pyrosequencing and bioinformatics

    Directory of Open Access Journals (Sweden)

    Jiang Zhu

    2012-09-01

    Full Text Available Select HIV-1-infected individuals develop sera capable of neutralizing diverse viral strains. The molecular basis of this neutralization is currently being deciphered by the isolation of HIV-1-neutralizing antibodies. In one infected donor, three neutralizing antibodies, PGT135-137, were identified by assessment of neutralization from individually sorted B cells and found to recognize an epitope containing an N-linked glycan at residue 332 on HIV-1 gp120. Here we use deep sequencing and bioinformatics methods to interrogate the B cell record of this donor to gain a more complete understanding of the humoral immune response. PGT135-137-gene family-specific primers were used to amplify heavy and light chain-variable domain sequences. 454 pyrosequencing produced 141,298 heavy-chain sequences of IGHV4-39 origin and 87,229 light-chain sequences of IGKV3-15 origin. A number of heavy and light chain sequences of ~90% identity to PGT137, several to PGT136, and none of high identity to PGT135 were identified. After expansion of these sequences to include close phylogenetic relatives, a total of 202 heavy-chain sequences and 72 light-chain sequences were identified. These sequences were clustered into populations of 95% identity comprising 15 for heavy chain and 10 for light chain, and a select sequence from each population was synthesized and reconstituted with a PGT137-partner chain. Reconstituted antibodies showed varied neutralization phenotypes for HIV-1 clade A and D isolates. Sequence diversity of the antibody population represented by these tested sequences was notably higher than observed with a 454 pyrosequencing-control analysis on 10 antibodies of defined sequence, suggesting that this diversity results primarily from somatic maturation. Our results thus provide an example of how pathogens like HIV-1 are opposed by a varied humoral immune response, derived from intrinsic mechanisms of antibody development, and embodied by somatic populations

  11. A review of bioinformatics training applied to research in molecular medicine, agriculture and biodiversity in Costa Rica and Central America.

    Science.gov (United States)

    Orozco, Allan; Morera, Jessica; Jiménez, Sergio; Boza, Ricardo

    2013-09-01

    Today, Bioinformatics has become a scientific discipline with great relevance for the Molecular Biosciences and for the Omics sciences in general. Although developed countries have progressed with large strides in Bioinformatics education and research, in other regions, such as Central America, the advances have occurred in a gradual way and with little support from academia, either at the undergraduate or graduate level. To address this problem, the University of Costa Rica's Medical School, a regional leader in Bioinformatics in Central America, has been conducting a series of Bioinformatics workshops, seminars and courses, leading to the creation of the region's first Bioinformatics Master's Degree. The recent creation of the Central American Bioinformatics Network (BioCANET), associated with the deployment of a supporting computational infrastructure (HPC Cluster) devoted to providing computing support for Molecular Biology in the region, is laying a foundation stone for the development of Bioinformatics in the area. Central American bioinformaticians have participated in the creation of, and co-founded, the Iberoamerican Bioinformatics Society (SOIBIO). In this article, we review the most recent activities in education and research in Bioinformatics from several regional institutions. These activities have resulted in further advances for Molecular Medicine, Agriculture and Biodiversity research in Costa Rica and the rest of the Central American countries. Finally, we provide summary information on the first Central America Bioinformatics International Congress, as well as the creation of the first Bioinformatics company (Indromics Bioinformatics), an academic spin-off, in Central America and the Caribbean.

  12. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  13. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  14. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is a need for appropriate strategies for introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool and implementation of ...

  15. Protein evolution in two co-occurring types of Symbiodinium: an exploration into the genetic basis of thermal tolerance in Symbiodinium clade D

    Directory of Open Access Journals (Sweden)

    Ladner Jason T

    2012-11-01

    Background: The symbiosis between reef-building corals and photosynthetic dinoflagellates (Symbiodinium) is an integral part of the coral reef ecosystem, as corals are dependent on Symbiodinium for the majority of their energy needs. However, this partnership is increasingly at risk due to changing climatic conditions. It is thought that functional diversity within Symbiodinium may allow some corals to rapidly adapt to different environments by changing the type of Symbiodinium with which they partner; however, very little is known about the molecular basis of the functional differences among symbiont groups. One group of Symbiodinium that is hypothesized to be important for the future of reefs is clade D, which, in general, seems to provide the coral holobiont (i.e., coral host and associated symbiont community) with elevated thermal tolerance. Using high-throughput sequencing data from field-collected corals we assembled, de novo, draft transcriptomes for Symbiodinium clades C and D. We then explore the functional basis of thermal tolerance in clade D by comparing rates of coding sequence evolution among the four clades of Symbiodinium most commonly found in reef-building corals (A-D). Results: We are able to highlight a number of genes and functional categories as candidates for involvement in the increased thermal tolerance of clade D. These include a fatty acid desaturase, molecular chaperones and proteins involved in photosynthesis and the thylakoid membrane. We also demonstrate that clades C and D co-occur within most of the sampled colonies of Acropora hyacinthus, suggesting widespread potential for this coral species to acclimatize to changing thermal conditions via ‘shuffling’ the proportions of these two clades from within their current symbiont communities. Conclusions: Transcriptome-wide analysis confirms that the four main Symbiodinium clades found within corals exhibit extensive evolutionary divergence (18.5-27.3% avg

  16. Coexistence of two clades of enterovirus D68 in pediatric Swedish patients in the summer and fall of 2014.

    Science.gov (United States)

    Dyrdak, Robert; Rotzén-Östlund, Maria; Samuelson, Agneta; Eriksson, Margareta; Albert, Jan

    2015-01-01

    In 2014, an outbreak of enterovirus D68 (EV-D68) was observed in North America, with cases of severe respiratory illness and a possible etiological link to cases of acute flaccid paralysis. EV-D68 has also been reported from European countries, but no data from Sweden are available. This study investigated respiratory specimens collected during July-October 2014 from 30 Swedish children aged 0-9 years who were positive for enterovirus and/or rhinovirus in routine clinical PCR. Seven samples were typed as EV-D68 by VP4/VP2 sequencing. Two genetically distinct EV-D68 variants coexisted. Six viruses belonged to clade B, the variant involved in the North American outbreak, and one virus belonged to clade A. Respiratory illness was the major symptom among EV-D68 infected patients and all fully recovered. This is the first report of EV-D68 in Sweden. Considering the current epidemiological situation, genotyping and specific EV-D68 testing should be considered in patients with severe respiratory illness who test positive for enterovirus or rhinovirus in routine diagnostics.

  17. Novel H5N8 clade 2.3.4.4 highly pathogenic avian influenza virus in wild aquatic birds, Russia, 2016

    Science.gov (United States)

    H5N1 high pathogenicity avian influenza virus (HPAIV) emerged in 1996 in Guangdong China (Gs/GD) and has evolved into multiple genetic clades. Since 2008, HPAIV H5 clade 2.3.4 with N2, N5 and N8 neuraminidase subtypes have been identified in mainland China and outbreak of HPAIV H5N8 clade 2.3.4.4 ou...

  18. Analysis of requirements for teaching materials based on the course bioinformatics for plant metabolism

    Science.gov (United States)

    Balqis, Widodo, Lukiati, Betty; Amin, Mohamad

    2017-05-01

    A way to improve the quality of learning in the course of Plant Metabolism in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the needs of bioinformatics-based teaching material in the course Plant Metabolism by the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to the students in the Plant Metabolism course of the Department of Biology, University of Malang, and analysis of the plan of lectures semester (RPS). Learning gains of this course show that it is not yet integrated into the field of bioinformatics. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of a chemical compound of a local plant in Indonesia. Respondents thought that bioinformatics can explain examples and metabolism of a secondary metabolite analysis techniques and discuss potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures of plant metabolism. Therefore, the development of teaching materials including plant metabolism-based bioinformatics is important to improve the understanding of the lecture material in plant metabolism.

  19. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries. G. Charles Ostermeier (1), David J. Dix (2) and Stephen A. Krawetz (1). (1) Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  20. Phylogeny of the sea hares in the Aplysia clade based on mitochondrial DNA sequence data

    Energy Technology Data Exchange (ETDEWEB)

    Medina, Monica; Collins, Timothy; Walsh, Patrick J.

    2004-02-20

    Sea hare species within the Aplysia clade are distributed worldwide. Their phylogenetic and biogeographic relationships are, however, still poorly known. New molecular evidence is presented from a portion of the mitochondrial cytochrome oxidase c subunit 1 gene (cox1) that improves our understanding of the phylogeny of the group. Based on these data, a preliminary discussion of the present distribution of sea hares in a biogeographic context is put forward. Our findings are consistent with only some aspects of the current taxonomy, and nomenclatural changes are proposed. The first is the use of a rank-free classification for the different Aplysia clades and subclades, as opposed to the previously used genus and subgenus affiliations. The second is the suggestion that Aplysia brasiliana (Rang, 1828) is a junior synonym of Aplysia fasciata (Poiret, 1789). The third is the elimination of Neaplysia, since its only member is confirmed to be part of the large Varria clade.

  1. Kretzoiarctos gen. nov., the oldest member of the giant panda clade.

    Science.gov (United States)

    Abella, Juan; Alba, David M; Robles, Josep M; Valenciano, Alberto; Rotgers, Cheyenn; Carmona, Raül; Montoya, Plinio; Morales, Jorge

    2012-01-01

    The phylogenetic position of the giant panda, Ailuropoda melanoleuca (Carnivora: Ursidae: Ailuropodinae), has been one of the most hotly debated topics by mammalian biologists and paleontologists during the last century. Based on molecular data, it is currently recognized as a true ursid, sister-taxon of the remaining extant bears, from which it would have diverged by the Early Miocene. However, from a paleobiogeographic and chronological perspective, the origin of the giant panda lineage has remained elusive due to the scarcity of the available Miocene fossil record. Until recently, the genus Ailurarctos from the Late Miocene of China (ca. 8-7 mya) was recognized as the oldest undoubted member of the Ailuropodinae, suggesting that the panda lineage might have originated from an Ursavus ancestor. The role of the purported ailuropodine Agriarctos, from the Miocene of Europe, in the origins of this clade has been generally dismissed due to the paucity of the available material. Here, we describe a new ailuropodine genus, Kretzoiarctos gen. nov., based on remains from two Middle Miocene (ca. 12-11 Ma) Spanish localities. A cladistic analysis of fossil and extant members of the Ursoidea confirms the inclusion of the new genus into the Ailuropodinae. Moreover, Kretzoiarctos precedes in time the previously-known, Late Miocene members of the giant panda clade from Eurasia (Agriarctos and Ailurarctos). The former can be therefore considered the oldest recorded member of the giant panda lineage, which has significant implications for understanding the origins of this clade from a paleobiogeographic viewpoint.

  2. Kretzoiarctos gen. nov., the oldest member of the giant panda clade.

    Directory of Open Access Journals (Sweden)

    Juan Abella

    The phylogenetic position of the giant panda, Ailuropoda melanoleuca (Carnivora: Ursidae: Ailuropodinae), has been one of the most hotly debated topics by mammalian biologists and paleontologists during the last century. Based on molecular data, it is currently recognized as a true ursid, sister-taxon of the remaining extant bears, from which it would have diverged by the Early Miocene. However, from a paleobiogeographic and chronological perspective, the origin of the giant panda lineage has remained elusive due to the scarcity of the available Miocene fossil record. Until recently, the genus Ailurarctos from the Late Miocene of China (ca. 8-7 mya) was recognized as the oldest undoubted member of the Ailuropodinae, suggesting that the panda lineage might have originated from an Ursavus ancestor. The role of the purported ailuropodine Agriarctos, from the Miocene of Europe, in the origins of this clade has been generally dismissed due to the paucity of the available material. Here, we describe a new ailuropodine genus, Kretzoiarctos gen. nov., based on remains from two Middle Miocene (ca. 12-11 Ma) Spanish localities. A cladistic analysis of fossil and extant members of the Ursoidea confirms the inclusion of the new genus into the Ailuropodinae. Moreover, Kretzoiarctos precedes in time the previously-known, Late Miocene members of the giant panda clade from Eurasia (Agriarctos and Ailurarctos). The former can be therefore considered the oldest recorded member of the giant panda lineage, which has significant implications for understanding the origins of this clade from a paleobiogeographic viewpoint.

  3. A clade-specific Arabidopsis gene connects primary metabolism and senescence

    Science.gov (United States)

    Plants have to deal with environmental insults as they cannot move to escape from stressful conditions. To do so, they have evolved novel components that respond to the changing environments. A primary example is Qua Quine Starch (QQS, AT3G30720), an Arabidopsis thaliana-specific (orphan) gene that ...

  4. Amphitremida (Poche, 1913) is a new major, ubiquitous labyrinthulomycete clade.

    Directory of Open Access Journals (Sweden)

    Fatma Gomaa

    Micro-eukaryotic diversity is poorly documented at all taxonomic levels and the phylogenetic affiliation of many taxa - including many well-known and common organisms - remains unknown. Among these incertae sedis taxa are Archerella flavum (Loeblich and Tappan, 1961) and Amphitrema wrightianum (Archer, 1869) (Amphitremidae), two filose testate amoebae commonly found in Sphagnum peatlands. To clarify their phylogenetic position, we amplified and sequenced the SSU rRNA gene obtained from four independent DNA extractions of A. flavum and three independent DNA extractions of A. wrightianum. Our molecular data demonstrate that genera Archerella and Amphitrema form a fully supported deep-branching clade within the Labyrinthulomycetes (Stramenopiles), together with Diplophrys sp. (ATCC50360) and several environmental clones obtained from a wide range of environments. This newly described clade, which we named Amphitremida, is diverse genetically, ecologically and physiologically. Our phylogenetic analysis suggests that osmotrophic species most likely evolved from phagotrophic ancestors and that the bothrosome, an organelle that produces cytoplasmic networks used for attachment to the substratum and to absorb nutrients from the environment, appeared late in labyrinthulomycete evolution.

  5. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  6. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.

  7. Discovery of putative salivary biomarkers for Sjögren's syndrome using high resolution mass spectrometry and bioinformatics.

    Science.gov (United States)

    Zoukhri, Driss; Rawe, Ian; Singh, Mabi; Brown, Ashley; Kublin, Claire L; Dawson, Kevin; Haddon, William F; White, Earl L; Hanley, Kathleen M; Tusé, Daniel; Malyj, Wasyl; Papas, Athena

    2012-03-01

    The purpose of the current study was to determine if saliva contains biomarkers that can be used as diagnostic tools for Sjögren's syndrome (SjS). Twenty-seven SjS patients and 27 age-matched healthy controls were recruited for these studies. Unstimulated glandular saliva was collected from the Wharton's duct using a suction device. Two µl of saliva were processed for mass spectrometry analyses on a prOTOF 2000 matrix-assisted laser desorption/ionization orthogonal time of flight (MALDI O-TOF) mass spectrometer. Raw data were analyzed using bioinformatic tools to identify biomarkers. MALDI O-TOF MS analyses of saliva samples were highly reproducible and the mass spectra generated were very rich in peptides and peptide fragments in the 750-7,500 Da range. Data analysis using bioinformatic tools resulted in several classification models being built and several biomarkers identified. One model based on 7 putative biomarkers yielded a sensitivity of 97.5%, specificity of 97.8% and an accuracy of 97.6%. One biomarker was present only in SjS samples and was identified as a proteolytic peptide originating from human basic salivary proline-rich protein 3 precursor. We conclude that salivary biomarkers detected by high-resolution mass spectrometry coupled with powerful bioinformatic tools offer the potential to serve as diagnostic/prognostic tools for SjS.
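
    For readers unfamiliar with the three figures quoted for the classification model, the sketch below shows how sensitivity, specificity and accuracy are conventionally derived from a confusion matrix. The function name and the counts are hypothetical illustrations; they are not the study's data, and the abstract does not say how its percentages were averaged or cross-validated.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: patients classified correctly/incorrectly,
    tn/fp: controls classified correctly/incorrectly.
    """
    return {
        "sensitivity": tp / (tp + fn),            # fraction of patients detected
        "specificity": tn / (tn + fp),            # fraction of controls cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for 27 patients and 27 controls (NOT the study's results).
metrics = diagnostic_metrics(tp=26, fn=1, tn=26, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```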

  8. KBWS: an EMBOSS associated package for accessing bioinformatics web services.

    Science.gov (United States)

    Oshita, Kazuki; Arakawa, Kazuharu; Tomita, Masaru

    2011-04-29

    The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  9. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  10. The World-Wide Web: An Interface between Research and Teaching in Bioinformatics

    Directory of Open Access Journals (Sweden)

    James F. Aiton

    1994-01-01

    The rapid expansion occurring in World-Wide Web activity is beginning to make the concepts of ‘global hypermedia’ and ‘universal document readership’ realistic objectives of the new revolution in information technology. One consequence of this increase in usage is that educators and students are becoming more aware of the diversity of the knowledge base which can be accessed via the Internet. Although computerised databases and information services have long played a key role in bioinformatics, these same resources can also be used to provide core materials for teaching and learning. The large datasets and archives that have been compiled for biomedical research can be enhanced with the addition of a variety of multimedia elements (images, digital videos, animation, etc.). The use of this digitally stored information in structured and self-directed learning environments is likely to increase as activity across the World-Wide Web increases.

  11. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    International Nuclear Information System (INIS)

    Roche-Lima, Abiel; Thulasiram, Ruppa K

    2012-01-01

    Finite automata in which each transition is augmented with an output label in addition to the familiar input label are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignment of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation, which relies on techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were run using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experiment varied the precision parameter; in this case, we obtain smaller execution times using the parallel algorithm. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was varied. In this last experiment, speedup increases considerably when more threads are used; however, it converges once the number of threads is equal to or greater than 16.
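
    The abstract does not define speedup. Under the conventional definition (single-thread run time divided by run time on p threads), the plateau it describes can be tabulated as in the sketch below. The function name and all timings are hypothetical and chosen only to show a curve that flattens at 16 threads; they are not measurements from the paper.

```python
def speedup_table(timings):
    """Map thread count -> (speedup, parallel efficiency).

    `timings` maps a thread count to wall-clock seconds and must include
    an entry for 1 thread, which serves as the baseline.
    """
    baseline = timings[1]
    table = {}
    for threads, seconds in sorted(timings.items()):
        speedup = baseline / seconds
        table[threads] = (speedup, speedup / threads)
    return table


# Hypothetical wall-clock times in seconds (NOT taken from the paper).
timings = {1: 100.0, 2: 50.0, 4: 25.0, 8: 16.0, 16: 10.0, 32: 10.0}
table = speedup_table(timings)
for threads, (speedup, efficiency) in table.items():
    print(f"{threads:>2} threads: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

    In this made-up series the speedup stops growing between 16 and 32 threads while efficiency keeps falling, which is the usual signature of the kind of convergence the abstract reports.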

  12. Development and evaluation of a bioinformatics approach for designing molecular assays for viral detection.

    Directory of Open Access Journals (Sweden)

    Pierre H H Schneeberger

    Viruses belonging to the Flaviviridae and Bunyaviridae families show considerable genetic diversity. However, this diversity is not necessarily taken into account when developing diagnostic assays, which are often based on the pairwise alignment of a limited number of sequences. Our objective was to develop and evaluate a bioinformatics workflow addressing two recurrent issues of molecular assay design: (i) the high intraspecies genetic diversity in viruses and (ii) the potential for cross-reactivity with close relatives. The workflow developed herein was based on two consecutive BLASTn steps; the first was utilized to select highly conserved regions among the viral taxon of interest, and the second was employed to assess the degree of similarity of these highly-conserved regions to close relatives. Subsequently, the workflow was tested on a set of eight viral species, including various strains from the Flaviviridae and Bunyaviridae families. The genetic diversity ranges from as low as 0.45% variable sites over the complete genome of the Japanese encephalitis virus to more than 16% of variable sites on segment L of the Crimean-Congo hemorrhagic fever virus. Our proposed bioinformatics workflow allowed the selection, based on computing scores, of the best target for a diagnostic molecular assay for the eight viral species investigated. Our bioinformatics workflow allowed rapid selection of highly conserved and specific genomic fragments among the investigated viruses, while considering up to several hundred complete genomic sequences. The pertinence of this workflow will increase in parallel to the number of sequences made publicly available. We hypothesize that our workflow might be utilized to select diagnostic molecular markers for higher organisms with more complex genomes, provided the sequences are made available.
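
    To make the "percentage of variable sites" and "highly conserved region" notions concrete, the sketch below computes both on a toy multiple alignment. It is a simplified stand-in and not the authors' method: the published workflow uses two BLASTn steps and its own scoring, whereas this example assumes pre-aligned, equal-length sequences and treats any column with more than one distinct character (including gaps) as variable. The function names and sequences are hypothetical.

```python
def variable_site_fraction(aligned):
    """Fraction of alignment columns containing more than one distinct character."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    variable = sum(1 for column in zip(*aligned) if len(set(column)) > 1)
    return variable / length


def most_conserved_window(aligned, width):
    """Return (start, variable_count) of the window with the fewest variable columns.

    Uses a sliding-window sum over per-column variability flags; the earliest
    window wins ties.
    """
    flags = [len(set(column)) > 1 for column in zip(*aligned)]
    count = sum(flags[:width])
    best_start, best_count = 0, count
    for start in range(1, len(flags) - width + 1):
        # Slide right by one: add the entering column, drop the leaving one.
        count += flags[start + width - 1] - flags[start - 1]
        if count < best_count:
            best_start, best_count = start, count
    return best_start, best_count


# Toy alignment (hypothetical sequences, not real viral genomes).
alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
fraction = variable_site_fraction(alignment)
window = most_conserved_window(alignment, width=4)
print(f"variable sites: {fraction:.1%}, most conserved 4-mer window starts at {window[0]}")
```

    A real assay design would additionally check the chosen window against close relatives for cross-reactivity, which is the role of the second BLASTn step described in the abstract.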

  13. Implementing a Web-Based Introductory Bioinformatics Course for Non-Bioinformaticians That Incorporates Practical Exercises

    Science.gov (United States)

    Vincent, Antony T.; Bourbonnais, Yves; Brouard, Jean-Simon; Deveau, Hélène; Droit, Arnaud; Gagné, Stéphane M.; Guertin, Michel; Lemieux, Claude; Rathier, Louis; Charette, Steve J.; Lagüe, Patrick

    2018-01-01

    A recent scientific discipline, bioinformatics, defined as using informatics for the study of biological problems, is now a requirement for the study of biological sciences. Bioinformatics has become such a powerful and popular discipline that several academic institutions have created programs in this field, allowing students to become…

  14. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  15. Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species.

    Science.gov (United States)

    Lassiter, Erica S; Russ, Carsten; Nusbaum, Chad; Zeng, Qiandong; Saville, Amanda C; Olarte, Rodrigo A; Carbone, Ignazio; Hu, Chia-Hui; Seguin-Orlando, Andaine; Samaniego, Jose A; Thorne, Jeffrey L; Ristaino, Jean B

    2015-11-01

    Phytophthora infestans is one of the most destructive plant pathogens of potato and tomato globally. The pathogen is closely related to four other Phytophthora species in the 1c clade including P. phaseoli, P. ipomoeae, P. mirabilis and P. andina that are important pathogens of other wild and domesticated hosts. P. andina is an interspecific hybrid between P. infestans and an unknown Phytophthora species. We have sequenced mitochondrial genomes of the sister species of P. infestans and examined the evolutionary relationships within the clade. Phylogenetic analysis indicates that the P. phaseoli mitochondrial lineage is basal within the clade. P. mirabilis and P. ipomoeae are sister lineages and share a common ancestor with the Ic mitochondrial lineage of P. andina. These lineages in turn are sister to the P. infestans and P. andina Ia mitochondrial lineages. The P. andina Ic lineage diverged much earlier than the P. andina Ia mitochondrial lineage and P. infestans. The presence of two mitochondrial lineages in P. andina supports the hybrid nature of this species. The ancestral state of the P. andina Ic lineage in the tree and its occurrence only in the Andean regions of Ecuador, Colombia and Peru suggests that the origin of this species hybrid in nature may occur there.

  16. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems, ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques to solve a variety of biological problems. Among the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics based problems.
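
    As a minimal illustration of the selection, crossover and mutation loop that genetic algorithms are built on, the sketch below evolves random DNA strings toward a target sequence. It is a toy under stated assumptions, not an algorithm from the review: the fitness function (number of matching positions), tournament selection of size 3, single-point crossover, elitism and every parameter value are illustrative choices.

```python
import random

ALPHABET = "ACGT"


def evolve(target, pop_size=50, mutation_rate=0.05, generations=200, seed=0):
    """Evolve strings toward `target` (length >= 2); return (best, fitness)."""
    rng = random.Random(seed)

    def fitness(individual):
        # Number of positions that already match the target.
        return sum(a == b for a, b in zip(individual, target))

    def mutate(individual):
        # Each base is replaced by a random base with probability mutation_rate.
        return "".join(
            rng.choice(ALPHABET) if rng.random() < mutation_rate else base
            for base in individual
        )

    def crossover(a, b):
        # Single-point crossover: prefix of one parent, suffix of the other.
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def select(population):
        # Tournament selection: fittest of three random individuals.
        return max(rng.sample(population, 3), key=fitness)

    population = [
        "".join(rng.choice(ALPHABET) for _ in target) for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        if fitness(best) == len(target):
            break
        children = [best]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            children.append(mutate(crossover(select(population), select(population))))
        population = children
        best = max(population, key=fitness)
    return best, fitness(best)


target = "ACGTTGCAAC"
best, score = evolve(target)
print(f"best match: {best} ({score}/{len(target)} positions)")
```

    Because of elitism, the best fitness never decreases from one generation to the next; whether the target is reached exactly within the generation budget depends on the random seed and parameters.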

  17. Influenza A H5N1 clade 2.3.4 virus with a different antiviral susceptibility profile replaced clade 1 virus in humans in northern Vietnam

    NARCIS (Netherlands)

    Le, Mai T. Q.; Wertheim, Heiman F. L.; Nguyen, Hien D.; Taylor, Walter; Hoang, Phuong V. M.; Vuong, Cuong D.; Nguyen, Hang L. K.; Nguyen, Ha H.; Nguyen, Thai Q.; Nguyen, Trung V.; van, Trang D.; Ngoc, Bich T.; Bui, Thinh N.; Nguyen, Binh G.; Nguyen, Liem T.; Luong, San T.; Phan, Phuc H.; Pham, Hung V.; Nguyen, Tung; Fox, Annette; Nguyen, Cam V.; Do, Ha Q.; Crusat, Martin; Farrar, Jeremy; Nguyen, Hien T.; de Jong, Menno D.; Horby, Peter

    2008-01-01

    BACKGROUND: Prior to 2007, highly pathogenic avian influenza (HPAI) H5N1 viruses isolated from poultry and humans in Vietnam were consistently reported to be clade 1 viruses, susceptible to oseltamivir but resistant to amantadine. Here we describe the re-emergence of human HPAI H5N1 virus infections

  18. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    Directory of Open Access Journals (Sweden)

    Thomas K. Karikari

    2015-09-01

    Bioinformatics and genome science (BGS) are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so.

  19. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database

    Directory of Open Access Journals (Sweden)

    Jeongseok Choi

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are becoming more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
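    As a rough illustration of OAuth 2.0 access authorization of the kind described above, the sketch below builds the standard authorization-code request parameters (RFC 6749). The abstract does not say which grant type IABio uses, so the authorization code grant is an assumption, and the endpoint URL, client identifier, scope and function names are all hypothetical.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; the abstract does not give the real one.
AUTHORIZE_ENDPOINT = "https://iabio.example.org/oauth/authorize"

def build_authorization_url(client_id, redirect_uri, scope, state=None):
    """Return (url, state) for an authorization-code request."""
    state = state or secrets.token_urlsafe(16)  # anti-CSRF value
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params), state

def state_matches(expected, callback_url):
    """Check that the state echoed on the redirect equals the one we sent."""
    returned = parse_qs(urlsplit(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(expected.encode(), returned.encode())

def build_token_request(code, client_id, redirect_uri):
    """Form fields for exchanging the authorization code for an access token."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    }
```

    The client would redirect the user to the returned URL, verify `state` on the callback, and then POST the token request fields to the server's token endpoint.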

  20. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  1. Two novel species representing a new clade and cluster of Phytophthora.

    Science.gov (United States)

    Yang, Xiao; Copes, Warren E; Hong, Chuanxue

    2014-01-01

    Phytophthora stricta sp. nov. and Phytophthora macilentosa sp. nov. are described based on morphological, physiological and molecular characters in this study. Phytophthora stricta represents a previously unknown clade in the rRNA internal transcribed spacer (ITS)-based phylogeny. Phytophthora macilentosa, along with nine other species, consistently forms a high temperature-tolerant cluster within ITS clade 9. These observations are supported by the sequence analysis of the mitochondrial cytochrome c oxidase 1 gene. Both species are heterothallic and all examined isolates are A1 mating type. Phytophthora stricta produces nonpapillate and slightly caducous sporangia. This species is named after its characteristic constrictions on sporangiophores. Phytophthora macilentosa produces nonpapillate and noncaducous sporangia, which are mostly elongated obpyriform with a high length to breadth ratio. Both species were recovered from irrigation water of an ornamental plant nursery in Mississippi, USA and P. stricta was also recovered from stream water in Virginia, USA. Copyright © 2013 The British Mycological Society. All rights reserved.

  2. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  3. Comparative Proteome Bioinformatics: Identification of Phosphotyrosine Signaling Proteins in the Unicellular Protozoan Ciliate Tetrahymena

    DEFF Research Database (Denmark)

    Gammeltoft, Steen; Christensen, Søren Tvorup; Joachimiak, Marcin

    2005-01-01

    Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH...

  4. Morphological features of the species of the genus Chlamydomonas s.l. (Chlorophyta) from various molecular clades

    Directory of Open Access Journals (Sweden)

    Maria N. Pavlovska

    2012-03-01

    The morphology of 78 authentic strains from 5 clades was investigated under culture conditions, and a complex of phenotypic features was established. These features include: the type and origin of mucilage, mucilage collapse under methylene blue, retention of the papilla and stigma in the non-motile stage, extracellular matrix formation inside the cell wall, the mode of sporangium rupture, pyrenoid and stigma behaviour before cell division, cell shape, and chloroplast morphology. Diagnostic features for determining taxa at the clade level are discussed.

  5. Analyzing the field of bioinformatics with the multi-faceted topic modeling technique.

    Science.gov (United States)

    Heo, Go Eun; Kang, Keun Young; Song, Min; Lee, Jeong-Hoon

    2017-05-31

    Bioinformatics is an interdisciplinary field at the intersection of molecular biology and computing technology. To characterize the field as a convergent domain, researchers have used bibliometrics, augmented with text-mining techniques for content analysis. In previous studies, Latent Dirichlet Allocation (LDA) was the most representative topic modeling technique for identifying topic structure of subject areas. However, as opposed to revealing the topic structure in relation to metadata such as authors, publication date, and journals, LDA only displays the simple topic structure. In this paper, we adopt Tang et al.'s Author-Conference-Topic (ACT) model to study the field of bioinformatics from the perspective of keyphrases, authors, and journals. The ACT model is capable of incorporating the paper, author, and conference into the topic distribution simultaneously. To obtain more meaningful results, we use journals and keyphrases instead of conferences and bag-of-words. For analysis, we use PubMed to collect forty-six bioinformatics journals from the MEDLINE database. We conducted time series topic analysis over four periods from 1996 to 2015 to further examine the interdisciplinary nature of bioinformatics. We analyze the ACT Model results in each period. Additionally, for further integrated analysis, we conduct a time series analysis among the top-ranked keyphrases, journals, and authors according to their frequency. We also examine the patterns in the top journals by simultaneously identifying the topical probability in each period, as well as the top authors and keyphrases. The results indicate that in recent years diversified topics have become more prevalent and convergent topics have become more clearly represented. The results of our analysis imply that over time the field of bioinformatics becomes more interdisciplinary, with a steady increase in peripheral fields such as conceptual, mathematical, and system biology. These results are

  6. libcov: A C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny

    OpenAIRE

    Butt, Davin; Roger, Andrew J; Blouin, Christian

    2005-01-01

    Background: An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time-consuming task. Results: The bioinformatics library libcov is a collection of C++ classes that provides a high and low-level interface to maximum likelihood phylogenetics, sequence analysis and a data structure for structural biological methods. libcov can be used ...

  7. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  8. A clade in the QUASIMODO2 family evolved with vascular plants and supports a role for cell wall composition in adaptation to environmental changes.

    Science.gov (United States)

    Fuentes, Sara; Pires, Nuno; Østergaard, Lars

    2010-08-01

    The evolution of plant vascular tissue is tightly linked to the evolution of specialised cell walls. Mutations in the QUASIMODO2 (QUA2) gene from Arabidopsis thaliana were previously shown to result in cell adhesion defects due to reduced levels of the cell wall component homogalacturonic acid. In this study, we provide additional information about the role of QUA2 and its closest paralogues, QUASIMODO2 LIKE1 (QUL1) and QUL2. Within the extensive QUA2 family, our phylogenetic analysis shows that these three genes form a clade that evolved with vascular plants. Consistent with a possible role of this clade in vasculature development, QUA2 is highly expressed in the vascular tissue of embryos and inflorescence stems and overexpression of QUA2 resulted in temperature-sensitive xylem collapse. Moreover, in-depth characterisation of qua2 qul1 qul2 triple mutant and 35S::QUA2 overexpression plants revealed contrasting temperature-dependent stem development with dramatic effects on stem width. Taken together, our results suggest that the QUA2-specific clade contributed to the evolution of vasculature and illustrate the important role that modification of cell wall composition plays in the adaptation to changing environmental conditions, including changes in temperature.

  9. Complete genome sequence of Campylobacter jejuni strain 12567 a livestock-associated clade representative

    Science.gov (United States)

    We report the complete genome sequence of the Campylobacter jejuni strain 12567, a member of a C. jejuni livestock-associated clade that expresses glycoconjugates linked to improved gastrointestinal tract persistence....

  10. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High-throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd), in collaboration with the International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics. Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  11. MACBenAbim: A Multi-platform Mobile Application for searching keyterms in Computational Biology and Bioinformatics.

    Science.gov (United States)

    Oluwagbemi, Olugbenga O; Adewumi, Adewole; Esuruoso, Abimbola

    2012-01-01

    Computational biology and bioinformatics are gradually gaining ground in Africa and other developing nations of the world. However, in these countries, some of the challenges of computational biology and bioinformatics education are inadequate infrastructure and a lack of readily available complementary and motivational tools to support learning as well as research. This has discouraged many promising undergraduates, postgraduates and researchers from aspiring to undertake future study in these fields. In this paper, we developed and described MACBenAbim (Multi-platform Mobile Application for Computational Biology and Bioinformatics), a flexible user-friendly tool to search for, define and describe the meanings of keyterms in computational biology and bioinformatics, thus expanding the frontiers of knowledge of the users. This tool can also visualize results in a mobile multi-platform context. MACBenAbim is available from the authors for non-commercial purposes.

  12. Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting.

    Science.gov (United States)

    Ibrahim, Bashar; Arkhipova, Ksenia; Andeweg, Arno C; Posada-Céspedes, Susana; Enault, François; Gruber, Arthur; Koonin, Eugene V; Kupczok, Anne; Lemey, Philippe; McHardy, Alice C; McMahon, Dino P; Pickett, Brett E; Robertson, David L; Scheuermann, Richard H; Zhernakova, Alexandra; Zwart, Mark P; Schönhuth, Alexander; Dutilh, Bas E; Marz, Manja

    2018-05-14

    The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.

  13. Molecular Phylogeny of the Parasitic Dinoflagellate Chytriodinium within the Gymnodinium Clade (Gymnodiniales, Dinophyceae).

    Science.gov (United States)

    Gómez, Fernando; Skovgaard, Alf

    2015-01-01

    The dinoflagellate genus Chytriodinium, an ectoparasite of copepod eggs, is reported for the first time in the North and South Atlantic Oceans. We provide the first large subunit rDNA (LSU rDNA) and Internal Transcribed Spacer 1 (ITS1) sequences, which were identical in both hemispheres for the Atlantic Chytriodinium sp. The first complete small subunit ribosomal DNA (SSU rDNA) of the Atlantic Chytriodinium sp. suggests that the specimens belong to an undescribed species. This is the first evidence of the split of the Gymnodinium clade: one clade for the parasitic forms of Chytriodiniaceae (Chytriodinium, Dissodinium), and another clade for the free-living species. © 2014 The Author(s) Journal of Eukaryotic Microbiology © 2014 International Society of Protistologists.

  14. iTools: a framework for classification, categorization and integration of computational biology resources.

    Directory of Open Access Journals (Sweden)

    Ivo D Dinov

    2008-05-01

    -term resource management. We demonstrate several applications of iTools as a framework for integrated bioinformatics. iTools and the complete details about its specifications, usage and interfaces are available at the iTools web page http://iTools.ccb.ucla.edu.

  15. Protecting innovation in bioinformatics and in-silico biology.

    Science.gov (United States)

    Harrison, Robert

    2003-01-01

    Commercial success or failure of innovation in bioinformatics and in-silico biology requires the appropriate use of legal tools for protecting and exploiting intellectual property. These tools include patents, copyrights, trademarks, design rights, and limiting information in the form of 'trade secrets'. Potentially patentable components of bioinformatics programmes include lines of code, algorithms, data content, data structure and user interfaces. In both the US and the European Union, copyright protection is granted for software as a literary work, and most other major industrial countries have adopted similar rules. Nonetheless, the grant of software patents remains controversial and is being challenged in some countries. Current debate extends to aspects such as whether patents can claim not only the apparatus and methods but also the data signals and/or products, such as a CD-ROM, on which the programme is stored. The patentability of substances discovered using in-silico methods is a separate debate that is unlikely to be resolved in the near future.

  16. Transmissibility of the monkeypox virus clades via respiratory transmission: investigation using the prairie dog-monkeypox virus challenge system.

    Directory of Open Access Journals (Sweden)

    Christina L Hutson

    Monkeypox virus (MPXV) is endemic within Africa, where it is sporadically reported to cause outbreaks of human disease. In 2003, an outbreak of human MPXV occurred in the US after the importation of infected African rodents. Since the eradication of smallpox (caused by an orthopoxvirus (OPXV) related to MPXV) and cessation of routine smallpox vaccination (with the live OPXV vaccinia), there is an increasing population of people susceptible to OPXV diseases. Previous studies have shown that the prairie dog MPXV model is a functional animal model for the study of systemic human OPXV illness. Studies with this model have demonstrated that infected animals are able to transmit the virus to naive animals through multiple routes of exposure causing subsequent infection, but were not able to prove that infected animals could transmit the virus exclusively via the respiratory route. Herein we used the model system to evaluate the hypothesis that the Congo Basin clade of MPXV is more easily transmitted, via the respiratory route, than the West African clade. Using a small number of test animals, we show that transmission of viruses from each of the MPXV clades was minimal via respiratory transmission. However, transmissibility of the Congo Basin clade was slightly greater than that of the West African MPXV clade (16.7% and 0% respectively). Based on these findings, respiratory transmission appears to be less efficient than the contact-based transmission assessed in previous studies within the prairie dog MPXV animal model.

  17. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene

  18. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. © The Author 2015. Published by Oxford University Press on behalf of the European

  19. Cryptic host-specific diversity among western hemisphere broomrapes (Orobanche s.l., Orobanchaceae).

    Science.gov (United States)

    Schneider, Adam C; Colwell, Alison E L; Schneeweiss, Gerald M; Baldwin, Bruce G

    2016-11-01

    The broomrapes, Orobanche sensu lato (Orobanchaceae), are common root parasites found across Eurasia, Africa and the Americas. All species native to the western hemisphere, recognized as Orobanche sections Gymnocaulis and Nothaphyllon, form a clade that has a centre of diversity in western North America, but also includes four disjunct species in central and southern South America. The wide ecological distribution coupled with moderate taxonomic diversity make this clade a valuable model system for studying the role, if any, of host-switching in driving the diversification of plant parasites. Two spacer regions of ribosomal nuclear DNA (ITS + ETS), three plastid regions and one low-copy nuclear gene were sampled from 163 exemplars of Orobanche from across the native geographic range in order to infer a detailed phylogeny. Together with comprehensive data on the parasites' native host ranges, associations between phylogenetic lineages and host specificity are tested. Within the two currently recognized species of O. sect. Gymnocaulis, seven strongly supported clades were found. While commonly sympatric, members of these clades each had unique host associations. Strong support for cryptic host-specific diversity was also found in sect. Nothaphyllon, while other taxonomic species were well supported. We also find strong evidence for multiple amphitropical dispersals from central North America into South America. Host-switching is an important driver of diversification in western hemisphere broomrapes, where host specificity has been grossly underestimated. More broadly, host specificity and host-switching probably play fundamental roles in the speciation of parasitic plants. © The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user-defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the ELIXIR registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  1. Nispero: a cloud-computing based Scala tool specially suited for bioinformatics data processing

    OpenAIRE

    Evdokim Kovach; Alexey Alekhin; Eduardo Pareja Tobes; Raquel Tobes; Eduardo Pareja; Marina Manrique

    2014-01-01

    Nowadays it is widely accepted that the bioinformatics data analysis is a real bottleneck in many research activities related to life sciences. High-throughput technologies like Next Generation Sequencing (NGS) have completely reshaped the biology and bioinformatics landscape. Undoubtedly NGS has allowed important progress in many life-sciences related fields but has also presented interesting challenges in terms of computation capabilities and algorithms. Many kinds of tasks related with NGS...

  2. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal; Salama, Khaled N.; Zidan, Mohammed A.

    2012-01-01

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we
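    For reference, the Smith-Waterman recurrence mentioned above can be sketched as a plain sequential scoring routine. This is only a baseline illustration, not the hybrid multiprocessor technique the record describes; the linear gap penalty, the scoring values and the function name are assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    Each cell depends only on its left, upper and upper-left neighbours, so only two rows are kept in memory here.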

  3. Repeated evolution of vertebrate pollination syndromes in a recently diverged Andean plant clade.

    Science.gov (United States)

    Lagomarsino, Laura P; Forrestel, Elisabeth J; Muchhala, Nathan; Davis, Charles C

    2017-08-01

    Although specialized interactions, including those involving plants and their pollinators, are often invoked to explain high species diversity, they are rarely explored at macroevolutionary scales. We investigate the dynamic evolution of hummingbird and bat pollination syndromes in the centropogonid clade (Lobelioideae: Campanulaceae), an Andean-centered group of ∼550 angiosperm species. We demonstrate that flowers hypothesized to be adapted to different pollinators based on flower color fall into distinct regions of morphospace, and this is validated by morphology of species with known pollinators. This supports the existence of pollination syndromes in the centropogonids, an idea corroborated by ecological studies. We further demonstrate that hummingbird pollination is ancestral, and that bat pollination has evolved ∼13 times independently, with ∼11 reversals. This convergence is associated with correlated evolution of floral traits within selective regimes corresponding to pollination syndrome. Collectively, our results suggest that floral morphological diversity is extremely labile, likely resulting from selection imposed by pollinators. Finally, even though this clade's rapid diversification is partially attributed to their association with vertebrate pollinators, we detect no difference in diversification rates between hummingbird- and bat-pollinated lineages. Our study demonstrates the utility of pollination syndromes as a proxy for ecological relationships in macroevolutionary studies of certain species-rich clades. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.

  4. Genomic Characterization of Two Novel SAR11 Isolates From the Red Sea, Including the First Strain of the SAR11 Ib clade

    KAUST Repository

    Jimenez Infante, Francy M.

    2017-06-22

    The SAR11 clade (Pelagibacterales) is a diverse group that forms a monophyletic clade within the Alphaproteobacteria, and constitutes up to one third of all prokaryotic cells in the photic zone of most oceans. Pelagibacterales are very abundant in the warm and highly saline surface waters of the Red Sea, raising the question of adaptive traits of SAR11 populations in this water body and warmer oceans throughout the world. In this study, two pure cultures were successfully obtained from surface waters of the Red Sea, one isolate of subgroup Ia and one of the previously uncultured SAR11 Ib lineage. The novel genomes were very similar to each other and to genomes of isolates of SAR11 subgroup Ia (Ia pan-genome), both in terms of gene content and synteny. Among the genes that were not present in the Ia pan-genome, 108 (RS39, Ia) and 151 genes (RS40, Ib) were strain-specific. Detailed analyses showed that only 51 (RS39, Ia) and 55 (RS40, Ib) of these strain-specific genes had not been reported before on genome fragments of Pelagibacterales. Further analyses revealed the potential production of phosphonates by some SAR11 members and possible adaptations for oligotrophic life, including pentose sugar utilization and adhesion to marine particulate matter.

  5. Genomic Characterization of Two Novel SAR11 Isolates From the Red Sea, Including the First Strain of the SAR11 Ib clade

    KAUST Repository

    Jimenez Infante, Francy M.; Ngugi, David; Vinu, Manikandan; Blom, Jochen; Alam, Intikhab; Bajic, Vladimir B.; Stingl, Ulrich

    2017-01-01

    The SAR11 clade (Pelagibacterales) is a diverse group that forms a monophyletic clade within the Alphaproteobacteria and constitutes up to one third of all prokaryotic cells in the photic zone of most oceans. Pelagibacterales are very abundant in the warm and highly saline surface waters of the Red Sea, raising the question of adaptive traits of SAR11 populations in this water body and in warmer oceans throughout the world. In this study, two pure cultures were successfully obtained from surface waters of the Red Sea, one isolate of subgroup Ia and one of the previously uncultured SAR11 Ib lineage. The novel genomes were very similar to each other and to genomes of isolates of SAR11 subgroup Ia (Ia pan-genome), both in terms of gene content and synteny. Among the genes that were not present in the Ia pan-genome, 108 (RS39, Ia) and 151 genes (RS40, Ib) were strain-specific. Detailed analyses showed that only 51 (RS39, Ia) and 55 (RS40, Ib) of these strain-specific genes had not been reported before on genome fragments of Pelagibacterales. Further analyses revealed the potential production of phosphonates by some SAR11 members and possible adaptations for oligotrophic life, including pentose sugar utilization and adhesion to marine particulate matter.

  6. 'Students-as-partners' scheme enhances postgraduate students' employability skills while addressing gaps in bioinformatics education.

    Science.gov (United States)

    Mello, Luciane V; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan

    2017-01-01

    Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student-staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators' teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability.

  7. A bioinformatics-based overview of protein Lys-Nε-acetylation

    Science.gov (United States)

    Among posttranslational modifications, there are some conceptual similarities between Lys-Nε-acetylation and Ser/Thr/Tyr O-phosphorylation. Herein we present a bioinformatics-based overview of reversible protein Lys-acetylation, including some comparisons with reversible protein phosphorylation. T...

  8. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  9. Global cooling as a driver of diversification in a major marine clade

    Science.gov (United States)

    Davis, Katie E.; Hill, Jon; Astrop, Tim I.; Wills, Matthew A.

    2016-10-01

    Climate is a strong driver of global diversity and will become increasingly important as human influences drive temperature changes at unprecedented rates. Here we investigate diversification and speciation trends within a diverse group of aquatic crustaceans, the Anomura. We use a phylogenetic framework to demonstrate that speciation rate is correlated with global cooling across the entire tree, in contrast to previous studies. Additionally, we find that marine clades continue to show evidence of increased speciation rates with cooler global temperatures, while the single freshwater clade shows the opposite trend with speciation rates positively correlated to global warming. Our findings suggest that both global cooling and warming lead to diversification and that habitat plays a role in the responses of species to climate change. These results have important implications for our understanding of how extant biota respond to ongoing climate change and are of particular importance for conservation planning of marine ecosystems.

  10. Specifics of multi-project management: interaction and resources constraints

    Directory of Open Access Journals (Sweden)

    Tsvetkova Nadezhda

    2017-01-01

    Multi-project management is fundamentally different from the control of a single project or a set of loosely interconnected projects in terms of complexity and specifics. In multi-project management of a company's production, it is important to analyze the interaction of innovations and its impact on the commercialization stage. A multiparameter factor of innovation interaction was introduced. The optimization problem that considers this factor was mathematically defined. The solution of this problem produces a schedule of innovation launches. This problem definition allows updating the objective function to correspond to the aims of a manufacturing company. For example, it can help maximize the number of interdependent innovations subject to restrictions on current tangible and intangible resources, or minimize the number of tangible resources used at a fixed number of innovations implemented. To verify the optimization problem, an evolutionary approach based on a genetic algorithm and local search is used. The verification was performed with Solver, a Microsoft Excel add-in. The readiness of the proposed solution for practical use was demonstrated by experiment.

  11. Prototype of A/Duck/Sukoharjo/Bbvw-1428-9/2012 subtipe H5N1 clade 2.3.2 as vaccine on local duck

    Directory of Open Access Journals (Sweden)

    Risa Indriani

    2014-06-01

    The A/Duck/Sukoharjo/Bbvw-1428-9/2012 virus, subtype H5N1 clade 2.3.2, was evaluated as a seed vaccine in local ducks. An AI H5N1 clade 2.3.2 vaccine containing 256 HAU per dose was formulated using the adjuvant ISA 71VG Montanide™. Six groups of one-day-old local ducks were used in this study. Three groups (10 ducks per group) were vaccinated and three groups (9 ducks per group) served as controls. Vaccination was conducted with a single dose when the ducks were three weeks old. Three weeks after vaccination, the ducks were challenged with either HPAI H5N1 clade 2.3.2 or HPAI H5N1 clade 2.1.3 virus at a dose of 10^6 EID50/0.1 ml by intranasal drops. Results showed that vaccination produced 100% protection, compared to unvaccinated ducks, against HPAI subtype H5N1 clade 2.3.2, and 100% protection against HPAI H5N1 clade 2.1.3 (A/ck/wj/Subang-29/2007 and A/ck/wj/Smi-Part/2006), while unvaccinated ducks showed virus shedding on day 3 post infection.

  12. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate a more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...

  13. In situ morphometric survey elucidates the evolutionary systematics of the Eurasian Himantoglossum clade (Orchidaceae: Orchidinae)

    Directory of Open Access Journals (Sweden)

    Richard M. Bateman

    2017-01-01

    Background and Aims: The charismatic Himantoglossum s.l. clade of Eurasian orchids contains an unusually large proportion of taxa that are of controversial circumscriptions and considerable conservation concern. Whereas our previously published study addressed the molecular phylogenetics and phylogeography of every named taxon within the clade, here we use detailed morphometric data obtained from the same populations to compare genotypes with associated phenotypes, in order to better explore taxonomic circumscription and character evolution within the clade. Methods: Between one and 12 plants found in 25 populations that encompassed the entire distribution of the Himantoglossum s.l. clade were measured in situ for 51 morphological characters. Results for 45 of those characters were subjected to detailed multivariate and univariate analyses. Key Results: Multivariate analyses readily separate subgenus Barlia and subgenus Comperia from subgenus Himantoglossum, and also the early-divergent H. formosum from the less divergent remainder of subgenus Himantoglossum. The sequence of divergence of these four lineages is confidently resolved. Our experimental approach to morphometric character analysis demonstrates clearly that phenotypic evolution within Himantoglossum is unusually multi-dimensional. Conclusions: Degrees of divergence between taxa shown by morphological analyses approximate those previously shown using molecular analyses. Himantoglossum s.l. is readily divisible into three subgenera. The three sections of subgenus Himantoglossum—hircinum, caprinum and formosum—are arrayed from west to east with only limited geographical overlap. At this taxonomic level, their juxtaposition combines with conflict between contrasting datasets to complicate attempts to distinguish between clinal variation and the discontinuities that by definition separate bona fide species. All taxa achieve allogamy via food deceit and have only weak pollinator specificity

  14. Flying lemurs – The 'flying tree shrews'? Molecular cytogenetic evidence for a Scandentia-Dermoptera sister clade

    Directory of Open Access Journals (Sweden)

    Volobouev Vitaly

    2008-05-01

    Background: Flying lemurs or Colugos (order Dermoptera) represent an ancient mammalian lineage that contains only two extant species. Although molecular evidence strongly supports that the orders Dermoptera, Scandentia, Lagomorpha, Rodentia and Primates form a superordinal clade called Supraprimates (or Euarchontoglires), the phylogenetic placement of Dermoptera within Supraprimates remains ambiguous. Results: To search for cytogenetic signatures that could help to clarify the evolutionary affinities within this superordinal group, we have established a genome-wide comparative map between human and the Malayan flying lemur (Galeopterus variegatus) by reciprocal chromosome painting using both human and G. variegatus chromosome-specific probes. The 22 human autosomal paints and the X chromosome paint defined 44 homologous segments in the G. variegatus genome. A putative inversion on GVA 11 was revealed by the hybridization patterns of human chromosome probes 16 and 19. Fifteen associations of human chromosome segments (HSA) were detected in the G. variegatus genome: HSA1/3, 1/10, 2/21, 3/21, 4/8, 4/18, 7/15, 7/16, 7/19, 10/16, 12/22 (twice), 14/15, 16/19 (twice). Reverse painting of G. variegatus chromosome-specific paints onto human chromosomes confirmed the above results, and defined the origin of the homologous human chromosomal segments in these associations. In total, G. variegatus paints revealed 49 homologous chromosomal segments in the HSA genome. Conclusion: Comparative analysis of our map with published maps from representative species of other placental orders, including Scandentia, Primates, Lagomorpha and Rodentia, suggests a signature rearrangement (HSA2q/21 association) that links Scandentia and Dermoptera to one sister clade. Our results thus provide new evidence for the hypothesis that Scandentia and Dermoptera have a closer phylogenetic relationship to each other than either of them has to Primates.

  15. Toward Personalized Pressure Ulcer Care Planning: Development of a Bioinformatics System for Individualized Prioritization of Clinical Pratice Guideline

    Science.gov (United States)

    2016-10-01

    Award number: W81XWH-15-1-0342. ...recommendations of CPG has been identified by experts in the field. We will use bioinformatics to enable data extraction, storage, and analysis to support

  16. Biological Data Resources at the EMBL-EBI

    Directory of Open Access Journals (Sweden)

    Rodrigo Lopez

    2008-07-01

    The European Bioinformatics Institute (EBI) is an Outstation of the European Molecular Biology Laboratory (EMBL). These are Europe’s flagships in bioinformatics and basic research in molecular biology. The EBI has been maintaining core data resources in molecular biology for 15 years and is notionally custodian to the largest collection of databases and services in Life Sciences in Europe. EBI provides free and unrestricted access to these resources to the international research community. The data resources at the EBI are divided into thematic categories. Each represents a special knowledge domain where one or several databases are maintained. The aims of this note are to introduce the reader to these resources and briefly outline training and education activities which may be of interest to students as well as academic staff in general. The web portal for the EBI can be found at http://www.ebi.ac.uk/ and represents a single entry point for all data resources and activities described below.

  17. High Ancient Genetic Diversity of Human Lice, Pediculus humanus, from Israel Reveals New Insights into the Origin of Clade B Lice.

    Science.gov (United States)

    Amanzougaghene, Nadia; Mumcuoglu, Kosta Y; Fenollar, Florence; Alfi, Shir; Yesilyurt, Gonca; Raoult, Didier; Mediannikov, Oleg

    2016-01-01

    The human head louse, Pediculus humanus capitis, is subdivided into several significantly divergent mitochondrial haplogroups, each with particular geographical distributions. Historically, they are among the oldest human parasites, representing an excellent marker for tracking older events in human evolutionary history. In this study, ancient DNA analysis using real-time polymerase chain reaction (qPCR), combined with conventional PCR, was applied to the remains of twenty-four ancient head lice and their eggs from the Roman period which were recovered from Israel. The lice and eggs were found in three combs, one of which was recovered from archaeological excavations in the Hatzeva area of the Judean desert, and two of which were found in Moa, in the Arava region, close to the Dead Sea. Results show that the head lice remains, dating to approximately 2,000 years ago, have a cytb haplogroup A, which is worldwide in distribution, and haplogroup B, which has thus far only been found in contemporary lice from America, Europe, Australia and, most recently, Africa. More specifically, this haplogroup B has a B36 haplotype, the most common among B haplogroups, and has been present in America for at least 4,000 years. The present findings confirm that clade B lice existed, at least in the Middle East, prior to contacts between Native Americans and Europeans. These results support a Middle Eastern origin for clade B followed by its introduction into the New World with the early peoples. Lastly, the presence of Acinetobacter baumannii DNA was demonstrated by qPCR and sequencing in four head lice remains belonging to clade A.

  18. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  19. LegumeDB bioinformatics resource: comparative genomic analysis and novel cross-genera marker identification in lupin and pasture legume species.

    Science.gov (United States)

    Moolhuijzen, P; Cakir, M; Hunter, A; Schibeci, D; Macgregor, A; Smith, C; Francki, M; Jones, M G K; Appels, R; Bellgard, M

    2006-06-01

    The identification of markers in legume pasture crops, which can be associated with traits such as protein and lipid production, disease resistance, and reduced pod shattering, is generally accepted as an important strategy for improving the agronomic performance of these crops. It has been demonstrated that many quantitative trait loci (QTLs) identified in one species can be found in other plant species. Detailed legume comparative genomic analyses can characterize the genome organization between model legume species (e.g., Medicago truncatula, Lotus japonicus) and economically important crops such as soybean (Glycine max), pea (Pisum sativum), chickpea (Cicer arietinum), and lupin (Lupinus angustifolius), thereby identifying candidate gene markers that can be used to track QTLs in lupin and pasture legume breeding. LegumeDB is a Web-based bioinformatics resource for legume researchers. LegumeDB analysis of Medicago truncatula expressed sequence tags (ESTs) has identified novel simple sequence repeat (SSR) markers (16 tested), some of which have been putatively linked to symbiosome membrane proteins in root nodules and cell-wall proteins important in plant-pathogen defence mechanisms. In preliminary PCR assays, these novel markers have been detected in Medicago truncatula and in at least one other legume species, Lotus japonicus, Glycine max, Cicer arietinum, and (or) Lupinus angustifolius (15/16 tested). Ongoing research has validated some of these markers to map them in a range of legume species that can then be used to compile composite genetic and physical maps. In this paper, we outline the features and capabilities of LegumeDB as an interactive application that provides legume genetic and physical comparative maps, and the efficient feature identification and annotation of the vast tracts of model legume sequences for convenient data integration and visualization. LegumeDB has been used to identify potential novel cross-genera polymorphic legume

  20. A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data.

    Science.gov (United States)

    Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young

    2017-08-15

    Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped to the rice genome and the plasmid sequence and classified. Reads that map to both are then used to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
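
    The read-classification idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input format (pairs of read ID and reference label), the labels "genome" and "plasmid", and the function name classify_reads are all assumptions made for illustration. A read whose alignments hit both references is flagged as a junction candidate.

```python
from collections import defaultdict

def classify_reads(alignments):
    """Classify reads by the references they align to.

    `alignments` is an iterable of (read_id, reference) pairs, where
    reference is "genome" or "plasmid" (hypothetical labels).
    """
    refs_by_read = defaultdict(set)
    for read_id, reference in alignments:
        refs_by_read[read_id].add(reference)

    classes = {}
    for read_id, refs in refs_by_read.items():
        if refs == {"genome", "plasmid"}:
            classes[read_id] = "junction"    # spans plant/transgene boundary
        elif refs == {"plasmid"}:
            classes[read_id] = "transgene"   # lies entirely within the insert
        elif refs == {"genome"}:
            classes[read_id] = "genome"      # ordinary host read
        else:
            classes[read_id] = "other"       # unexpected reference label
    return classes
```

    In practice the alignments would come from a read mapper, and junction candidates would still need sequence alignment to pin down the exact breakpoint, as the abstract describes.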

  1. High-throughput bioinformatics with the Cyrille2 pipeline system

    Directory of Open Access Journals (Sweden)

    de Groot Joost CW

    2008-02-01

    Background: Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results: We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: (1) a web-based graphical user interface (GUI) that enables a pipeline operator to manage the system; (2) the Scheduler, which forms the functional core of the system, tracks what data enters the system and determines what jobs must be scheduled for execution; and (3) the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion: The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high-throughput, flexible bioinformatics pipelines.
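
    The division of labour between a scheduler and an executor described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Cyrille2's actual design or API: the class and method names are hypothetical, the pipeline is a simple linear chain of steps, and jobs run in-process rather than on a compute cluster.

```python
from collections import deque

class Scheduler:
    """Tracks incoming data and decides which job must run next."""

    def __init__(self, steps):
        self.steps = list(steps)   # ordered [(name, callable), ...]
        self.pending = deque()     # scheduled jobs: (step_index, data)
        self.results = []          # data that has passed every step

    def data_arrived(self, data, step_index=0):
        # New data triggers the next step, or is final if none remain.
        if step_index < len(self.steps):
            self.pending.append((step_index, data))
        else:
            self.results.append(data)

class Executor:
    """Looks for scheduled jobs and runs them (here: locally)."""

    def __init__(self, scheduler):
        self.scheduler = scheduler

    def run_pending(self):
        executed = 0
        while self.scheduler.pending:
            step_index, data = self.scheduler.pending.popleft()
            _name, func = self.scheduler.steps[step_index]
            output = func(data)
            # Report the output back so the next step gets scheduled.
            self.scheduler.data_arrived(output, step_index + 1)
            executed += 1
        return executed
```

    Keeping the decision of what to run separate from the act of running it is what lets the executing side be swapped for a cluster submission layer without touching the scheduling logic.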

  2. Mitochondrial molecular clocks and the origin of the major Otocephalan clades (Pisces: Teleostei)

    DEFF Research Database (Denmark)

    Peng, Zuogang; He, Shunping; Wang, Jun

    2006-01-01

    The Otocephala, a clade including ostariophysan and clupeomorph teleosts, represents about a quarter of total fish species diversity, with about 1000 genera and more than 7000 species. A series of recent papers have defended that the origin of this clade and of its major groups may be significantly...... otophysans could have originated before the splitting of the Pangean supercontinent is of extreme importance, since otophysan fishes are among the most useful animal groups for the determination of historical continental relationships. In the present work we examined divergence times for each major...... otocephalan group by an analysis of complete mtDNA sequences, in order to investigate if these divergence times support the hypotheses advanced in recent studies. The complete mtDNA sequences of nine representative non-otocephalan fish species and of twenty-one representative otocephalan species was compared...

  3. How to handle speciose clades? Mass taxon-sampling as a strategy towards illuminating the natural history of Campanula (Campanuloideae).

    Directory of Open Access Journals (Sweden)

    Guilhem Mansion

    BACKGROUND: Speciose clades usually harbor species with a broad spectrum of adaptive strategies and complex distribution patterns, and thus constitute ideal systems to disentangle biotic and abiotic causes underlying species diversification. The delimitation of such study systems to test evolutionary hypotheses is difficult because they often rely on artificial genus concepts as starting points. One of the most prominent examples is the bellflower genus Campanula with some 420 species, but up to 600 species when including all lineages to which Campanula is paraphyletic. We generated a large alignment of petD group II intron sequences to include more than 70% of described species as a reference. By comparison with partial data sets we could then assess the impact of selective taxon sampling strategies on phylogenetic reconstruction and subsequent evolutionary conclusions. METHODOLOGY/PRINCIPAL FINDINGS: Phylogenetic analyses based on maximum parsimony (PAUP, PRAP), Bayesian inference (MrBayes), and maximum likelihood (RAxML) were first carried out on the large reference data set (D680). Parameters including tree topology, branch support, and age estimates, were then compared to those obtained from smaller data sets resulting from "classification-guided" (D088) and "phylogeny-guided sampling" (D101). Analyses of D088 failed to fully recover the phylogenetic diversity in Campanula, whereas D101 inferred significantly different branch support and age estimates. CONCLUSIONS/SIGNIFICANCE: A short genomic region with high phylogenetic utility allowed us to easily generate a comprehensive phylogenetic framework for the speciose Campanula clade. Our approach recovered 17 well-supported and circumscribed sub-lineages. Knowing these will be instrumental for developing more specific evolutionary hypotheses and guide future research, we highlight the predictive value of a mass taxon-sampling strategy as a first essential step towards illuminating the detailed

  4. How to Handle Speciose Clades? Mass Taxon-Sampling as a Strategy towards Illuminating the Natural History of Campanula (Campanuloideae)

    Science.gov (United States)

    Mansion, Guilhem; Parolly, Gerald; Crowl, Andrew A.; Mavrodiev, Evgeny; Cellinese, Nico; Oganesian, Marine; Fraunhofer, Katharina; Kamari, Georgia; Phitos, Dimitrios; Haberle, Rosemarie; Akaydin, Galip; Ikinci, Nursel; Raus, Thomas; Borsch, Thomas

    2012-01-01

    Background Speciose clades usually harbor species with a broad spectrum of adaptive strategies and complex distribution patterns, and thus constitute ideal systems to disentangle biotic and abiotic causes underlying species diversification. The delimitation of such study systems to test evolutionary hypotheses is difficult because they often rely on artificial genus concepts as starting points. One of the most prominent examples is the bellflower genus Campanula with some 420 species, but up to 600 species when including all lineages to which Campanula is paraphyletic. We generated a large alignment of petD group II intron sequences to include more than 70% of described species as a reference. By comparison with partial data sets we could then assess the impact of selective taxon sampling strategies on phylogenetic reconstruction and subsequent evolutionary conclusions. Methodology/Principal Findings Phylogenetic analyses based on maximum parsimony (PAUP, PRAP), Bayesian inference (MrBayes), and maximum likelihood (RAxML) were first carried out on the large reference data set (D680). Parameters including tree topology, branch support, and age estimates, were then compared to those obtained from smaller data sets resulting from “classification-guided” (D088) and “phylogeny-guided sampling” (D101). Analyses of D088 failed to fully recover the phylogenetic diversity in Campanula, whereas D101 inferred significantly different branch support and age estimates. Conclusions/Significance A short genomic region with high phylogenetic utility allowed us to easily generate a comprehensive phylogenetic framework for the speciose Campanula clade. Our approach recovered 17 well-supported and circumscribed sub-lineages. Knowing these will be instrumental for developing more specific evolutionary hypotheses and guide future research, we highlight the predictive value of a mass taxon-sampling strategy as a first essential step towards illuminating the detailed evolutionary

  5. Workshop summary: detection, impact, and control of specific pathogens in animal resource facilities.

    Science.gov (United States)

    Mansfield, Keith G; Riley, Lela K; Kent, Michael L

    2010-01-01

    Despite advances, infectious diseases remain a threat to animal facilities, continue to affect animal health, and serve as potential confounders of experimental research. A workshop entitled Detection, Impact, and Control of Specific Pathogens in Animal Resource Facilities was sponsored by the National Center for Research Resources (NCRR) and National Institute on Aging (NIA) and held April 23-24, 2009, at the Lister Hill Conference Center on the National Institutes of Health's (NIH) Bethesda campus. The meeting brought together laboratory animal scientists and veterinarians with experience in fish, rodent, and nonhuman primate models to identify common issues and problems. Session speakers addressed (1) common practices and current knowledge of these species, (2) new technologies in the diagnosis of infectious diseases, (3) impact of environmental quality on infectious disease, (4) normal microbial flora in health and disease, (5) genetics and infectious disease, and (6) specific infectious agents and their impact on research. Attendees discussed current challenges and future needs, highlighting the importance of education and training, the funding of critical infrastructure and resource research, and the need for improved communication of disease risks and integration of these risks with strategic planning. NIH and NCRR have a strong record of supporting resource initiatives that have helped address many of these issues, and recent efforts have focused on the building of consortium activities among such programs. This manuscript summarizes the presentations and conclusions of participants at the meeting; abstracts and a full conference report are available online (www.ncrr.nih.gov).

  6. Staff Scientist - RNA Bioinformatics | Center for Cancer Research

    Science.gov (United States)

    The newly established RNA Biology Laboratory (RBL) at the Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH) in Frederick, Maryland is recruiting a Staff Scientist with strong expertise in RNA bioinformatics to join the Intramural Research Program’s mission of high impact, high reward science. The RBL is the equivalent of an

  7. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    Science.gov (United States)

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live Linux distributions for bioinformatics have paved the way for portability of the bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery is an advanced customizable configuration of Fedora, with data persistence accessible via USB drive or DVD. Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  8. ‘Students-as-partners’ scheme enhances postgraduate students’ employability skills while addressing gaps in bioinformatics education

    Science.gov (United States)

    Mello, Luciane V.; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan

    2017-01-01

    Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student–staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators’ teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability. PMID:29098185

  9. Bioinformatics tools for development of fast and cost effective simple ...

    African Journals Online (AJOL)

    Bioinformatics tools for development of fast and cost effective simple sequence repeat ... comparative mapping and exploration of functional genetic diversity in the ... Already, a number of computer programs have been implemented that aim at ...

  10. Virginia Bioinformatics Institute to expand cyberinfrastructure education and outreach project

    OpenAIRE

    Whyte, Barry James

    2008-01-01

    The National Science Foundation has awarded the Virginia Bioinformatics Institute at Virginia Tech $918,000 to expand its education and outreach program in Cyberinfrastructure - Training, Education, Advancement and Mentoring, commonly known as the CI-TEAM.

  11. Effect of salt on the metabolism of 'Candidatus Accumulibacter' clade I and II

    NARCIS (Netherlands)

    Wang, Zhongwei; Dunne, Aislinn; van Loosdrecht, Mark C.M.; Saikaly, Pascal E.

    2018-01-01

    Saline wastewater is known to affect the performance of phosphate-accumulating organisms (PAOs) in enhanced biological phosphorus removal (EBPR) process. However, studies comparing the effect of salinity on different PAO clades are lacking. In this study, 'Candidatus Accumulibacter phosphatis'

  12. Taxonomy of the ant genus Proceratium Roger (Hymenoptera, Formicidae) in the Afrotropical region with a revision of the P. arnoldi clade and description of four new species

    Directory of Open Access Journals (Sweden)

    Francisco Hita Garcia

    2014-10-01

    The taxonomy of the genus Proceratium Roger is updated for the Afrotropical region. We give an overview of the genus in the region, provide an illustrated identification key to the three clades (P. arnoldi, P. stictum and P. toschii clades) and revise the P. arnoldi clade. Four species from the P. arnoldi clade are described as new: P. sokoke sp. n. from Kenya, P. carri sp. n. from Mozambique, and P. nilo sp. n. and P. sali sp. n. from Tanzania. In order to integrate the new species into the existing taxonomic system we present an illustrated identification key to distinguish the seven Afrotropical species of the P. arnoldi clade. In addition, we provide accounts for all members of the P. arnoldi clade including detailed descriptions, diagnoses, taxonomic discussions, distribution data and high quality montage images.

  13. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable. The analysis of Next Generation Sequencing datasets needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the 6th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, 28-30th March, 2012, which brought together interdisciplinary scientists with a strong background in the biological and computational sciences.

  14. Detection and Characterization of Clade 1 Reassortant H5N1 Viruses Isolated from Human Cases in Vietnam during 2013.

    Directory of Open Access Journals (Sweden)

    Sharmi W Thor

    Highly pathogenic avian influenza (HPAI) H5N1 is endemic in Vietnamese poultry and has caused sporadic human infection in Vietnam since 2003. Human infections with HPAI H5N1 are of concern due to a high mortality rate and the potential for the emergence of pandemic viruses with sustained human-to-human transmission. Viruses isolated from humans in southern Vietnam have been classified as clade 1 with a single genome constellation (VN3) since their earliest detection in 2003. This is consistent with detection of this clade/genotype in poultry viruses endemic to the Mekong River Delta and surrounding regions. Comparison of H5N1 viruses detected in humans from southern Vietnamese provinces during 2012 and 2013 revealed the emergence of a 2013 reassortant virus with clade 1.1.2 hemagglutinin (HA) and neuraminidase (NA) surface protein genes but internal genes derived from clade 2.3.2.1a viruses (A/Hubei/1/2010-like; VN12). Closer analysis revealed mutations in multiple genes of this novel genotype (referred to as VN49) previously associated with increased virulence in animal models and other markers of adaptation to mammalian hosts. Despite the changes identified between the 2012 and 2013 genotypes analyzed, their virulence in a ferret model was similar. Antigenically, the 2013 viruses were less cross-reactive with ferret antiserum produced to the clade 1 progenitor virus, A/Vietnam/1203/2004, but reacted with antiserum produced against a new clade 1.1.2 WHO candidate vaccine virus (A/Cambodia/W0526301/2012) with hemagglutination inhibition titers comparable to those of the homologous antigen. Together, these results indicate changes to both surface and internal protein genes of H5N1 viruses circulating in southern Vietnam compared to 2012 and earlier viruses.

  15. Phylogenetic signal detection from an ancient rapid radiation: Effects of noise reduction, long-branch attraction, and model selection in crown clade Apocynaceae.

    Science.gov (United States)

    Straub, Shannon C K; Moore, Michael J; Soltis, Pamela S; Soltis, Douglas E; Liston, Aaron; Livshultz, Tatyana

    2014-11-01

    Crown clade Apocynaceae comprise seven primary lineages of lianas, shrubs, and herbs with a diversity of pollen aggregation morphologies including monads, tetrads, and pollinia, making them an ideal group for investigating the evolution and function of pollen packaging. Traditional molecular systematic approaches utilizing small amounts of sequence data have failed to resolve relationships along the spine of the crown clade, a likely ancient rapid radiation. The previous best estimate of the phylogeny was a five-way polytomy, leaving ambiguous the homology of aggregated pollen in two major lineages, the Periplocoideae, which possess pollen tetrads, and the milkweeds (Secamonoideae plus Asclepiadoideae), which possess pollinia. To assess whether greatly increased character sampling would resolve these relationships, a plastome sequence data matrix was assembled for 13 taxa of Apocynaceae, including nine newly generated complete plastomes, one partial new plastome, and three previously reported plastomes, collectively representing all primary crown clade lineages and outgroups. The effects of phylogenetic noise, long-branch attraction, and model selection (linked versus unlinked branch lengths among data partitions) were evaluated in a hypothesis-testing framework based on Shimodaira-Hasegawa tests. Discrimination among alternative crown clade resolutions was affected by all three factors. Exclusion of the noisiest alignment positions and topologies influenced by long-branch attraction resulted in a trichotomy along the spine of the crown clade consisting of Rhabdadenia+the Asian clade, Baisseeae+milkweeds, and Periplocoideae+the New World clade. Parsimony reconstruction on all optimal topologies after noise exclusion unambiguously supports parallel evolution of aggregated pollen in Periplocoideae (tetrads) and milkweeds (pollinia). Our phylogenomic approach has greatly advanced the resolution of one of the most perplexing radiations in Apocynaceae, providing the

  16. Phylogeny of Elatinaceae and the Tropical Gondwanan Origin of the Centroplacaceae(Malpighiaceae, Elatinaceae) Clade.

    Directory of Open Access Journals (Sweden)

    Liming Cai

    The flowering plant family Elatinaceae is a widespread aquatic lineage inhabiting temperate and tropical latitudes, including ∼35(-50) species. Its phylogeny remains largely unknown, compromising our understanding of its systematics. Moreover, this group is particularly in need of attention because the biogeography of most aquatic plant clades has yet to be investigated, resulting in uncertainty about whether aquatic plants show histories that deviate from terrestrial plants. We inferred the phylogeny of Elatinaceae from four DNA regions spanning 59 accessions across the family. An expanded sampling was used for molecular divergence time estimation and ancestral area reconstruction to infer the biogeography of Elatinaceae and their closest terrestrial relatives, Malpighiaceae and Centroplacaceae. The two genera of Elatinaceae, Bergia and Elatine, are monophyletic, but several traditionally recognized groups within the family are non-monophyletic. Our results suggest two ancient biogeographic events in the Centroplacaceae(Malpighiaceae, Elatinaceae) clade involving western Gondwana, while Elatinaceae shows a more complicated biogeographic history with a high degree of continental endemicity. Our results indicate the need for further taxonomic investigation of Elatinaceae. Further, our study is one of few to implicate ancient Gondwanan biogeography in extant angiosperms, especially significant given the Centroplacaceae(Malpighiaceae, Elatinaceae) clade's largely tropical distribution. Finally, Elatinaceae demonstrates long-term continental in situ diversification, which argues against recent dispersal as a universal explanation commonly invoked for aquatic plant distributions.

  17. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  18. Into the Himalayan Exile: The Phylogeography of the Ground Beetle Ethira clade Supports the Tibetan Origin of Forest-Dwelling Himalayan Species Groups

    Science.gov (United States)

    Schmidt, Joachim; Opgenoorth, Lars; Höll, Steffen; Bastrop, Ralf

    2012-01-01

    The Himalayan mountain arc is one of the hotspots of biodiversity on earth, and species diversity is expected to be especially high among insects in this region. Little is known about the origin of the Himalayan insect fauna. With respect to the fauna of high altitude cloud forests, it has generally been accepted that Himalayan lineages are derived from ancestors that immigrated from Western Asia and from adjacent mountainous regions of East and Southeast Asia (immigration hypothesis). In this study, we sought to test a Tibetan Origin as an alternative hypothesis for groups with a poor dispersal ability through a phylogeographic analysis of the Ethira clade of the genus Pterostichus. We sequenced COI mtDNA and the 18S and 28S rDNA genes in 168 Pterostichini specimens, including 46 species and subspecies of the Ethira clade. In our analysis, we were able to show that the Ethira clade is monophyletic and, thus, represents a Himalayan endemic clade, supporting endemism of two of the basal lineages to the Central Himalaya and documenting large distributional gaps within the phylogeographic structure of the Ethira clade. Furthermore, the molecular data strongly indicate very limited dispersal abilities of species and subspecies of these primarily wingless ground beetles. These results are consistent with the hypothesis of a Tibetan Origin, which explains the evolution, diversity and distribution of the Himalayan ground beetle Ethira clade much more parsimoniously than the original immigration hypothesis. PMID:23049805

  19. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

    Science.gov (United States)

    Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J

    2017-09-20

    There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm that processes named entities for use in a Vector Space Model with Cosine Similarity Measures. To our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly
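
    The vector space model and cosine similarity mentioned in this record can be illustrated with a minimal sketch. This is not the PIBAS FedSPARQL algorithm, whose details the abstract does not give; the function name `cosine_similarity` and the whitespace tokenizer are assumptions made purely for illustration.

```python
import math
from collections import Counter


def cosine_similarity(label_a: str, label_b: str) -> float:
    """Cosine similarity of two labels under a bag-of-words vector space model.

    Hypothetical helper for illustration only: tokens are lower-cased,
    whitespace-separated words, weighted by raw term counts.
    """
    vec_a = Counter(label_a.lower().split())
    vec_b = Counter(label_b.lower().split())
    # Dot product over the terms the two labels share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty label is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

    Two labels attached to different URIs would score close to 1 when they share most of their terms and 0 when they share none; any threshold for calling two items "similar" would have to be chosen by the user of such a sketch.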

  20. Genomic characterization of two novel SAR11 isolates from the Red Sea, including the first strain of the SAR11 Ib clade.

    Science.gov (United States)

    Jimenez-Infante, Francy; Ngugi, David Kamanda; Vinu, Manikandan; Blom, Jochen; Alam, Intikhab; Bajic, Vladimir B; Stingl, Ulrich

    2017-07-01

    The SAR11 clade (Pelagibacterales) is a diverse group that forms a monophyletic clade within the Alphaproteobacteria, and constitutes up to one third of all prokaryotic cells in the photic zone of most oceans. Pelagibacterales are very abundant in the warm and highly saline surface waters of the Red Sea, raising the question of adaptive traits of SAR11 populations in this water body and warmer oceans throughout the world. In this study, two pure cultures were successfully obtained from surface waters of the Red Sea: one isolate of subgroup Ia and one of the previously uncultured SAR11 Ib lineage. The novel genomes were very similar to each other and to genomes of isolates of SAR11 subgroup Ia (Ia pan-genome), both in terms of gene content and synteny. Among the genes that were not present in the Ia pan-genome, 108 (RS39, Ia) and 151 genes (RS40, Ib) were strain specific. Detailed analyses showed that only 51 (RS39, Ia) and 55 (RS40, Ib) of these strain-specific genes had not been reported before on genome fragments of Pelagibacterales. Further analyses revealed the potential production of phosphonates by some SAR11 members and possible adaptations for oligotrophic life, including pentose sugar utilization and adhesion to marine particulate matter.

  1. EVOLUTION OF NUCLEAR RDNA ITS SEQUENCES IN THE CLADOPHORA ALBIDA/SERICEA CLADE (CHLOROPHYTA)

    NARCIS (Netherlands)

    BAKKER, FT; OLSEN, JL; STAM, WT

    Ribosomal DNA ITS sequences were compared among 13 different species and biogeographic isolates from the monophyletic "albida/sericea clade" in the green algal genus Cladophora. Six distinct ITS sequence types were found, characterized by multiple insertions and deletions and high levels of

  2. Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum

    Science.gov (United States)

    Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.

    2006-01-01

    We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…

  3. Species-specific variation in nesting and postfledging resource selection for two forest breeding migrant songbirds.

    Directory of Open Access Journals (Sweden)

    Julianna M A Jenkins

    Habitat selection is a fundamental component of community ecology, population ecology, and evolutionary biology and can be especially important to species with complex annual habitat requirements, such as migratory birds. Resource preferences on the breeding grounds may change during the postfledging period for migrant songbirds; however, the degree to which selection changes, the timing of change, and whether all or only a few species alter their resource use is unclear. We compared resource selection for nest sites and resource selection by postfledging juvenile ovenbirds (Seiurus aurocapilla) and Acadian flycatchers (Empidonax virescens) followed with radio telemetry in Missouri mature forest fragments from 2012-2015. We used Bayesian discrete choice modeling to evaluate support for local vegetation characteristics on the probability of selection for nest sites and locations utilized by different ages of postfledging juveniles. Patterns of resource selection variation were species-specific. Resource selection models indicated that Acadian flycatcher habitat selection criteria were similar for nesting and dependent postfledging juveniles and selection criteria diverged when juveniles became independent from adults. After independence, flycatcher resource selection was more associated with understory foliage density. Ovenbirds differed in selection criteria between the nesting and postfledging periods. Fledgling ovenbirds selected areas with higher densities of understory structure compared to nest sites, and the effect of foliage density on selection increased as juveniles aged and gained independence. The differences observed between two sympatric forest nesting species, in both the timing and degree of change in resource selection criteria over the course of the breeding season, illustrate the importance of considering species-specific traits and postfledging requirements when developing conservation efforts, especially when foraging guilds or

  4. Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery

    Science.gov (United States)

    Guingab-Cagmat, J.D.; Cagmat, E.B.; Hayes, R.L.; Anagli, J.

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase, which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data, presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of studies that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  5. Revision of the Middle American clade of the ant genus Stenamma Westwood (Hymenoptera, Formicidae, Myrmicinae)

    Directory of Open Access Journals (Sweden)

    Michael Branstetter

    2013-04-01

    Stenamma is a cryptic “leaf-litter” ant genus that occurs in mesic forest habitats throughout the Holarctic region, Central America, and part of northwestern South America (Colombia and Ecuador). The genus was thought to be restricted primarily to the temperate zone, but recent collecting efforts have uncovered a large radiation of Neotropical forms, which rival the Holarctic species in terms of morphological and behavioral diversity. By inferring a broad-scale molecular phylogeny of Stenamma, Branstetter (2012) showed that all Neotropical species belong to a diverse Middle American clade (MAC), and that this clade is sister to an almost completely geographically separated Holarctic clade (HOC). Here, the Middle American clade of Stenamma is revised to recognize 40 species, of which 33 are described as new. Included in the revision are a key to species based on the worker caste, and for each species where possible, descriptions and images of workers and queens, images of males, information on geographic distribution, descriptions of intraspecific variation, and notes on natural history. Several species groups are defined, but the majority of species remain unassigned due to a lack of diagnostic morphological character states for most molecular clades. The following species are redescribed: S. alas Longino, S. diversum Mann, S. expolitum Smith, S. felixi Mann, S. huachucanum Smith, S. manni Wheeler, and S. schmidti Menozzi. The following are described as new: S. andersoni sp. n., S. atribellum sp. n., S. brujita sp. n., S. callipygium sp. n., S. catracho sp. n., S. connectum sp. n., S. crypticum sp. n., S. cusuco sp. n., S. excisum sp. n., S. expolitico sp. n., S. hojarasca sp. n., S. ignotum sp. n., S. lagunum sp. n., S. llama sp. n., S. leptospinum sp. n., S. lobinodus sp. n., S. longinoi sp. n., S. maximon sp. n., S. megamanni sp. n., S. monstrosum sp. n., S. muralla sp. n., S. nanozoi sp. n., S. nonotch sp. n., S. ochrocnemis sp. n., S

  6. In silico cloning and bioinformatic analysis of PEPCK gene in ...

    African Journals Online (AJOL)

    Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. Based on the relative conservation of homologous genes, a bioinformatics strategy was applied to clone Fusarium ...

  7. Phylogenomic analyses of 539 highly informative loci dates a fully resolved time tree for the major clades of living turtles (Testudines).

    Science.gov (United States)

    Shaffer, H Bradley; McCartney-Melstad, Evan; Near, Thomas J; Mount, Genevieve G; Spinks, Phillip Q

    2017-10-01

    Accurate time-calibrated phylogenies are the centerpiece of many macroevolutionary studies, and the relationship between the size and scale of molecular data sets and the density and accuracy of fossil calibrations is a key element of time tree studies. Here, we develop a target capture array specifically for living turtles, compare its efficiency to an ultraconserved element (UCE) dataset, and present a time-calibrated molecular phylogeny based on 539 nuclear loci sequenced from 26 species representing the breadth of living turtle diversity plus outgroups. Our gene array, based on three fully sequenced turtle genomes, is 2.4 times more variable across turtles than a recently published UCE data set for an identical subset of 13 species, confirming that taxon-specific arrays return more informative data per sequencing effort than UCEs. We used our genomic data to estimate the ages of living turtle clades including a mid-late Triassic origin for crown turtles and a mid-Carboniferous split of turtles from their sister group, Archosauria. By specifically excluding several of the earliest potential crown turtle fossils and limiting the age of fossil calibration points to the unambiguous crown lineage Caribemys oxfordiensis from the Late Jurassic (Oxfordian, 163.5-157.3 Ma) we corroborate a relatively ancient age for living turtles. We also provide novel age estimates for five of the ten testudine families containing more than a single species, as well as several intrafamilial clades. Most of the diversity of crown turtles appears to date to the Paleogene, well after the Cretaceous-Paleogene mass extinction 66 mya.

  8. Phylogenomic analyses and molecular signatures for the class Halobacteria and its two major clades: a proposal for division of the class Halobacteria into an emended order Halobacteriales and two new orders, Haloferacales ord. nov. and Natrialbales ord. nov., containing the novel families Haloferacaceae fam. nov. and Natrialbaceae fam. nov.

    Science.gov (United States)

    Gupta, Radhey S; Naushad, Sohail; Baker, Sheridan

    2015-03-01

    The Halobacteria constitute one of the largest groups within the Archaea. The hierarchical relationship among members of this large class, which comprises a single order and a single family, has proven difficult to determine based upon 16S rRNA gene trees and morphological and physiological characteristics. This work reports detailed phylogenetic and comparative genomic studies on >100 halobacterial (haloarchaeal) genomes containing representatives from 30 genera to investigate their evolutionary relationships. In phylogenetic trees reconstructed on the basis of 32 conserved proteins, using both neighbour-joining and maximum-likelihood methods, two major clades (clades A and B) encompassing nearly two-thirds of the sequenced haloarchaeal species were strongly supported. Clades grouping the same species/genera were also supported by the 16S rRNA gene trees and trees for several individual highly conserved proteins (RpoC, EF-Tu, UvrD, GyrA, EF-2/EF-G). In parallel, our comparative analyses of protein sequences from haloarchaeal genomes have identified numerous discrete molecular markers in the form of conserved signature indels (CSI) in protein sequences and conserved signature proteins (CSPs) that are found uniquely in specific groups of haloarchaea. Thirteen CSIs in proteins involved in diverse functions and 68 CSPs that are uniquely present in all or most genome-sequenced haloarchaea provide novel molecular means for distinguishing members of the class Halobacteria from all other prokaryotes. The members of clade A are distinguished from all other haloarchaea by the unique shared presence of two CSIs in the ribose operon protein and small GTP-binding protein and eight CSPs that are found specifically in members of this clade. Likewise, four CSIs in different proteins and five other CSPs are present uniquely in members of clade B and distinguish them from all other haloarchaea. Based upon their specific clustering in phylogenetic trees for different gene

  9. Metabolic fluxes in the central carbon metabolism of Dinoroseobacter shibae and Phaeobacter gallaeciensis, two members of the marine Roseobacter clade

    Directory of Open Access Journals (Sweden)

    Rabus Ralf

    2009-09-01

    Background: In the present work the central carbon metabolism of Dinoroseobacter shibae and Phaeobacter gallaeciensis was studied at the level of metabolic fluxes. These two strains belong to the marine Roseobacter clade, a dominant bacterial group in various marine habitats, and represent surface-associated, biofilm-forming growth (P. gallaeciensis) and symbiotic growth with eukaryotic algae (D. shibae). Based on information from recently sequenced genomes, a rich repertoire of pathways has been identified in the carbon core metabolism of these organisms, but little is known about the actual contribution of the various reactions in vivo. Results: Using 13C labelling techniques in specifically designed experiments, it could be shown that glucose-grown cells of D. shibae catabolise the carbon source exclusively via the Entner-Doudoroff pathway, whereas alternative routes of glycolysis and the pentose phosphate pathway are obviously utilised for anabolic purposes only. Enzyme assays confirmed this flux pattern and link the lack of glycolytic flux to the absence of phosphofructokinase activity. The previously suggested formation of phosphoenolpyruvate from pyruvate during mixotrophic CO2 assimilation was found to be inactive under the conditions studied. Moreover, it could be shown that pyruvate carboxylase is involved in CO2 assimilation and that the cyclic respiratory mode of the TCA cycle is utilised. Interestingly, the use of intracellular pathways was highly similar for P. gallaeciensis. Conclusion: The present study reveals the first insight into pathway utilisation within the Roseobacter group. Fluxes through major intracellular pathways of the central carbon metabolism, which are closely linked to the various important traits found for the Roseobacter clade, could be determined. The close similarity of fluxes between the two physiologically rather different species might provide the first indication of more general key properties among

  10. Bioinformatics in the Netherlands : The value of a nationwide community

    NARCIS (Netherlands)

    van Gelder, Celia W.G.; Hooft, Rob; van Rijswijk, Merlijn; van den Berg, Linda; Kok, Ruben; Reinders, M.J.T.; Mons, Barend; Heringa, Jaap

    2017-01-01

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures

  11. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    Directory of Open Access Journals (Sweden)

    Anjani Ragothaman

    2014-01-01

    Full Text Available While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the predicted structural information could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically many thousands, which prohibits their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
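
    The task-level parallelism described in this abstract can be illustrated with a minimal sketch. It does not use SAGA-Pilot or eThread; it relies only on the Python standard library, and `model_structure` is a hypothetical stand-in for a single threading job. The longest-first ordering assumes that cost grows with sequence length, which the abstract does not state.

```python
from concurrent.futures import ThreadPoolExecutor


def model_structure(sequence):
    """Hypothetical stand-in for one threading job on one protein sequence."""
    return {"length": len(sequence), "status": "done"}


def run_batch(sequences, workers=4):
    """Run one job per sequence on a fixed pool of workers.

    `sequences` maps a sequence name to its amino acid string.
    """
    # Assumption: longer sequences cost more, so submit them first
    # to reduce the chance of one long job finishing alone at the end.
    order = sorted(sequences, key=lambda name: len(sequences[name]), reverse=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(model_structure, sequences[name]) for name in order}
        # Collect results; result() blocks until that job has finished.
        return {name: future.result() for name, future in futures.items()}


results = run_batch({"protA": "MKT", "protB": "MKTAYIAKQR"})
```

    A real pilot framework would additionally acquire and release remote resources and stage data, which this local sketch leaves out.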

  12. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Taylor, Ronald C.

    2010-01-01

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
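
    As an illustration of the MapReduce programming style named in this abstract, the following single-process sketch counts k-mers in sequencing reads. It does not use Hadoop; the function names and the k-mer counting task are assumptions chosen for illustration, and the in-memory `shuffle` stands in for the grouping a cluster framework would do across machines.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def kmer_counts(reads, k=3):
    """Chain map, shuffle, and reduce over a list of reads."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


counts = kmer_counts(["ACGTAC", "GTACGT"], k=3)
```

    Because each read is mapped independently and each key is reduced independently, both steps can be distributed across nodes, which is the property that lets the style scale.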

  13. Antigenic Variation in H5N1 clade 2.1 Viruses in Indonesia from 2005 to 2011

    Directory of Open Access Journals (Sweden)

    Vivi Setiawaty

    2013-01-01

    Full Text Available Influenza A (H5N1) virus has spread to several countries in the world and has a high mortality rate. Meanwhile, the virus has evolved into several clades. The human influenza A (H5N1) virus circulating in Indonesia is a member of clade 2.1, which differs in antigenicity from other clades of influenza A (H5N1). An analysis of the antigenic variation in the H5 hemagglutinin (HA) gene of the influenza A (H5N1) virus strains circulating in Indonesia has been undertaken. Amino acid mutations at several positions, including positions 35, 53, 141, 145, 163, 174, 183, 184, 189, and 231, have been identified. The mutation Val-174-Iso appears to play an important role in immunogenicity and cross-reactivity with rabbit antisera. This study shows that the evolution of the H5 HA antigenic variation of the influenza A (H5N1) virus circulating in Indonesia from 2005 to 2011 may affect the immunogenicity of the virus.
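
    Position-wise mutations of the kind listed in this abstract can be found by comparing two aligned protein sequences. The sketch below is an assumption about how such a comparison might look, not the authors' method: it reports substitutions in one-letter notation (for example `V5I`), and the `start` offset is a hypothetical parameter for matching whatever residue numbering scheme is in use.

```python
def substitutions(reference, variant, start=1):
    """List amino acid substitutions between two aligned protein sequences.

    Both sequences must be aligned to the same length; positions where
    either sequence has a gap ('-') are skipped.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    found = []
    for position, (ref, var) in enumerate(zip(reference, variant), start=start):
        if "-" in (ref, var):
            continue  # gaps are not substitutions
        if ref != var:
            found.append(f"{ref}{position}{var}")
    return found


# Toy sequences for illustration only, not real HA data.
changes = substitutions("MEKIVLL", "MEKIILL")
```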

  14. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  15. Molecular phylogeny and ecological diversification in a clade of New World songbirds (genus Vireo).

    Science.gov (United States)

    Cicero, C; Johnson, N K

    1998-10-01

    We constructed a molecular phylogeny for a clade of eye-ringed vireos (Vireo flavifrons and the V. solitarius complex) to examine existing hypotheses of speciation and ecological diversification. Complete sequences of the mtDNA cytochrome b gene were obtained from 47 individuals of this group plus four vireonid outgroups. Mean levels of sequence divergence in the clade varied from 0.29% to 5.7%. Differences were greatest between V. flavifrons and four taxa of 'V. solitarius'. The latter separated into three taxonomic, geographical and ecological groups: V. plumbeus plumbeus, V. cassinii cassinii, and V. solitarius solitarius plus V. solitarius alticola. These differed by an average of 2.6-3.2%. Populations within each group revealed low levels of sequence variation (x = 0.20%) and little geographical structuring. The mtDNA data generally corroborate results from allozymes. V. plumbeus shows a loss of yellow-green carotenoid pigmentation from the ancestral condition. The occupancy of relatively dry habitats by this species and V. cassinii represents a derived ecological shift from more-humid environments occupied by other species of vireonids. Ecological divergence in this clade occurred in allopatry and is associated with generic-level stability in morphometrics and foraging styles. Migratory behaviour and seasonal habitat shifts apparently evolved multiple times in vireos breeding in temperate environments. Present geographical and ecological distributions, and low levels of intrataxon genetic divergence, are hypothesized to be the result of postglacial regionalization of climate-plant associations and rapid northward expansion of breeding ranges.
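
    Percent sequence divergence values such as those reported here are computed from pairwise comparisons of aligned sequences. The abstract does not state which distance measure was used; the sketch below assumes the simplest one, the uncorrected p-distance, and skips alignment columns containing a gap or an ambiguous base.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise divergence between two aligned sequences, in percent."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


# Toy sequences for illustration only, not cytochrome b data.
divergence = p_distance("ACGTACGTAC", "ACGTACGTAT")
```

    Corrected distances that account for multiple substitutions at a site would give larger values than this measure at higher divergence levels.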

  16. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background: We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description: The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion: The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels.
First

  17. Characterization of the denitrification-associated phosphorus uptake properties of "Candidatus Accumulibacter phosphatis" clades in sludge subjected to enhanced biological phosphorus removal.

    Science.gov (United States)

    Kim, Jeong Myeong; Lee, Hyo Jung; Lee, Dae Sung; Jeon, Che Ok

    2013-03-01

    To characterize the denitrifying phosphorus (P) uptake properties of "Candidatus Accumulibacter phosphatis," a sequencing batch reactor (SBR) was operated with acetate. The SBR operation was gradually acclimated from anaerobic-oxic (AO) to anaerobic-anoxic-oxic (A2O) conditions by stepwise increases of nitrate concentration and the anoxic time. The communities of "Ca. Accumulibacter" and associated bacteria at the initial (AO) and final (A2O) stages were compared using 16S rRNA and polyphosphate kinase genes and using fluorescence in situ hybridization (FISH). The acclimation process led to a clear shift in the relative abundances of recognized "Ca. Accumulibacter" subpopulations from clades IIA > IA > IIF to clades IIC > IA > IIF, as well as to increases in the abundance of other associated bacteria (Dechloromonas [from 1.2% to 19.2%] and "Candidatus Competibacter phosphatis" [from 16.4% to 20.0%]), while the overall "Ca. Accumulibacter" abundance decreased (from 55.1% to 29.2%). A series of batch experiments combined with FISH/microautoradiography (MAR) analyses was performed to characterize the denitrifying P uptake properties of the "Ca. Accumulibacter" clades. In FISH/MAR experiments using slightly diluted sludge (∼0.5 g/liter), all "Ca. Accumulibacter" clades successfully took up phosphorus in the presence of nitrate. However, the "Ca. Accumulibacter" clades showed no P uptake in the presence of nitrate when the sludge was highly diluted (∼0.005 g/liter); under these conditions, reduction of nitrate to nitrite did not occur, whereas P uptake by "Ca. Accumulibacter" clades occurred when nitrite was added. These results suggest that the "Ca. Accumulibacter" cells lack nitrate reduction capabilities and that P uptake by "Ca. Accumulibacter" is dependent upon nitrite generated by associated nitrate-reducing bacteria such as Dechloromonas and "Ca. Competibacter."

  18. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    MicroRNAs (miRNAs) are small non-coding regulatory RNAs that play important roles in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high-throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large-scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  19. Computational biology of genome expression and regulation--a review of microarray bioinformatics.

    Science.gov (United States)

    Wang, Junbai

    2008-01-01

    Microarray technology is widely used in various biomedical research areas, and the corresponding microarray data analysis is an essential step toward making the best use of array technologies. Here we review two components of microarray data analysis: low-level analysis, which emphasizes the design, quality control, and preprocessing of microarray experiments, and high-level analysis, which focuses on domain-specific microarray applications such as tumor classification, biomarker prediction, analysis of array CGH experiments, and reverse engineering of gene expression networks. Additionally, we review recent developments in building predictive models in genome expression and regulation studies. This review may help biologists grasp the basics of microarray bioinformatics as well as its potential impact on the future development of biomedical research fields.

  20. Evolution of the intercontinental disjunctions in six continents in the Ampelopsis clade of the grape family (Vitaceae)

    Science.gov (United States)

    2012-01-01

    Background: The Ampelopsis clade (Ampelopsis and its close allies) of the grape family Vitaceae contains ca. 43 species disjunctly distributed in Asia, Europe, North America, South America, Africa, and Australia, and offers a rare opportunity to study both Northern and Southern Hemisphere intercontinental disjunctions. We reconstruct the temporal and spatial diversification of the Ampelopsis clade to explore the evolutionary processes that have resulted in their intercontinental disjunctions in six continents. Results: The Bayesian molecular clock dating and the likelihood ancestral area analyses suggest that the Ampelopsis clade most likely originated in North America with its crown group dated at 41.2 Ma (95% HPD 23.4 - 61.0 Ma) in the middle Eocene. Two independent Laurasian migrations into Eurasia are inferred to have occurred in the early Miocene via the North Atlantic land bridges. The ancestor of the Southern Hemisphere lineage migrated from North America to South America in the early Oligocene. The Gondwanan-like pattern of intercontinental disjunction is best explained by two long-distance dispersals: one from South America to Africa estimated at 30.5 Ma (95% HPD 16.9 - 45.9 Ma), and the other from South America to Australia dated to 19.2 Ma (95% HPD 6.7 - 22.3 Ma). Conclusions: The global disjunctions in the Ampelopsis clade are best explained by a diversification model of North American origin, two Laurasian migrations, one migration into South America, and two post-Gondwanan long-distance dispersals. These findings highlight the importance of both vicariance and long-distance dispersal in shaping intercontinental disjunctions of flowering plants. PMID:22316163