WorldWideScience

Sample records for sampling expedition metagenomic

  1. The Sorcerer II Global Ocean Sampling Expedition: metagenomic characterization of viruses within aquatic microbial samples.

    Directory of Open Access Journals (Sweden)

    Shannon J Williamson

    Full Text Available Viruses are the most abundant biological entities on our planet. Interactions between viruses and their hosts impact several important biological processes in the world's oceans such as horizontal gene transfer, microbial diversity and biogeochemical cycling. Interrogation of microbial metagenomic sequence data collected as part of the Sorcerer II Global Ocean Expedition (GOS) revealed a high abundance of viral sequences, representing approximately 3% of the total predicted proteins. Cluster analyses of the viral sequences revealed hundreds to thousands of viral genes encoding various metabolic and cellular functions. Quantitative analyses of viral genes of host origin performed on the viral fraction of aquatic samples confirmed the viral nature of these sequences and suggested that significant portions of aquatic viral communities behave as reservoirs of such genetic material. Distributional and phylogenetic analyses of these host-derived viral sequences also suggested that viral acquisition of environmentally relevant genes of host origin is a more abundant and widespread phenomenon than previously appreciated. The predominant viral sequences identified within microbial fractions originated from tailed bacteriophages and exhibited varying global distributions according to viral family. Recruitment of GOS viral sequence fragments against 27 complete aquatic viral genomes revealed that only one reference bacteriophage genome was highly abundant and was closely related, but not identical, to the cyanomyovirus P-SSM4. The co-distribution across all sampling sites of P-SSM4-like sequences with the dominant ecotype of its host, Prochlorococcus, supports the classification of the viral sequences as P-SSM4-like and suggests that this virus may influence the abundance, distribution and diversity of one of the most dominant components of picophytoplankton in oligotrophic oceans. In summary, the abundance and broad geographical distribution of viral

  2. Comparison of metagenomic samples using sequence signatures

    Directory of Open Access Journals (Sweden)

    Jiang Bai

    2012-12-01

    Full Text Available Abstract Background Sequence signatures, as defined by the frequencies of k-tuples (or k-mers, k-grams), have been used extensively to compare genomic sequences of individual organisms, to identify cis-regulatory modules, and to study the evolution of regulatory sequences. Recently many next-generation sequencing (NGS) read data sets of metagenomic samples from a variety of different environments have been generated. The assembly of these reads can be difficult and analysis methods based on mapping reads to genes or pathways are also restricted by the availability and completeness of existing databases. Sequence-signature-based methods, however, do not need the complete genomes or existing databases and thus can potentially be very useful for the comparison of metagenomic samples using NGS read data. Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied. Results We studied several dissimilarity measures, including d2, d2* and d2S recently developed from our group, a measure (hereinafter noted as Hao) used in CVTree developed from Hao’s group (Qi et al., 2004), measures based on relative di-, tri-, and tetra-nucleotide frequencies as in Willner et al. (2009), as well as standard lp measures between the frequency vectors, for the comparison of metagenomic samples using sequence signatures. We compared their performance using a series of extensive simulations and three real next-generation sequencing (NGS) metagenomic datasets: 39 fecal samples from 33 mammalian host species, 56 marine samples across the world, and 13 fecal samples from human individuals. Results showed that the dissimilarity measure d2S can achieve superior performance when comparing metagenomic samples by clustering them into different groups as well as recovering environmental gradients affecting microbial samples. New insights into the environmental factors affecting microbial compositions in metagenomic samples
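
The sequence-signature idea in the record above can be illustrated with a minimal Python sketch. It covers only the simplest of the measures listed, an lp distance between k-tuple frequency vectors; it does not reproduce d2, d2*, d2S or the Hao measure. The function names `kmer_frequencies` and `lp_distance` and the toy reads are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(reads, k):
    """Relative frequency of every k-tuple over A/C/G/T, pooled across reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-tuples containing N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    if total == 0:
        return {w: 0.0 for w in all_kmers}
    return {w: counts[w] / total for w in all_kmers}


def lp_distance(freq_a, freq_b, p=2):
    """Standard lp distance between two k-tuple frequency vectors."""
    return sum(abs(freq_a[w] - freq_b[w]) ** p for w in freq_a) ** (1.0 / p)


# Toy "samples": each is a list of reads (made-up data, not from the study).
sample_a = ["ACGTACGT", "ACGTTTGA"]
sample_b = ["GGGCCCGG", "CCGGGCCC"]
fa = kmer_frequencies(sample_a, 2)
fb = kmer_frequencies(sample_b, 2)
print(lp_distance(fa, fa))        # identical samples give 0.0
print(lp_distance(fa, fb, p=1))   # larger values mean more dissimilar signatures
```

Because the comparison works directly on read-level k-tuple counts, no assembly or reference database is needed, which is the property the abstract highlights.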

  3. Shotgun metagenomics, from sampling to analysis.

    Science.gov (United States)

    Quince, Christopher; Walker, Alan W; Simpson, Jared T; Loman, Nicholas J; Segata, Nicola

    2017-09-12

    Diverse microbial communities of bacteria, archaea, viruses and single-celled eukaryotes have crucial roles in the environment and in human health. However, microbes are frequently difficult to culture in the laboratory, which can confound cataloging of members and understanding of how communities function. High-throughput sequencing technologies and a suite of computational pipelines have been combined into shotgun metagenomics methods that have transformed microbiology. Still, computational approaches to overcome the challenges that affect both assembly-based and mapping-based metagenomic profiling, particularly of high-complexity samples or environments containing organisms with limited similarity to sequenced genomes, are needed. Understanding the functions and characterizing specific strains of these communities offers biotechnological promise in therapeutic discovery and innovative ways to synthesize products using microbial factories and can pinpoint the contributions of microorganisms to planetary, animal and human health.

  4. Exploring the functional side of the Ocean Sampling Day metagenomes

    Science.gov (United States)

    Antonio, F. G.; Kottmann, R.; Wallom, D.; Glöckner, F. O.

    2016-02-01

    The Ocean Sampling Day (OSD) is a simultaneous, collaborative, standardized, and global mega-sequencing campaign to analyze marine microbial community composition and functional traits. 150 metagenomes were sequenced from the first OSD in June 2014 including a rich set of environmental and oceanographic measurements. Unlike other ocean mega-surveys such as Global Ocean Sampling (GOS) or the TARA expedition that mostly sampled open ocean waters, most of the OSD samples are from coastal sampling sites, an area not previously well studied in this regard. The result is that OSD adds more than three million new genes to the recently published Ocean Microbial-Reference Gene Catalog (Sunagawa et al., 2015). This allows us to significantly increase our knowledge of the ocean microbiome, identify hot-spots of novelty in terms of function and investigate the impact of human activities on the oceans' coastal areas where there is the largest interaction between dense human populations and the marine world. Additionally, these cumulative samples, related in time, space and environmental parameters, are providing insights into fundamental rules describing microbial diversity and function and contribute to the blue economy through the identification of novel ocean-derived biotechnologies. References: Sunagawa, Coelho, Chaffron, et al. (2015, May). Structure and function of the global ocean microbiome. Science, 348(6237), 126135.

  5. Surveillance of Foodborne Pathogens: Towards Diagnostic Metagenomics of Fecal Samples

    DEFF Research Database (Denmark)

    Andersen, Sandra Christine; Hoorfar, Jeffrey

    2018-01-01

    Diagnostic metagenomics is a rapidly evolving laboratory tool for culture-independent tracing of foodborne pathogens. The method has the potential to become a generic platform for detection of most pathogens and many sample types. Today, however, it is still at an early and experimental stage....... Studies show that metagenomic methods, from sample storage and DNA extraction to library preparation and shotgun sequencing, have a great influence on data output. To construct protocols that extract the complete metagenome but with minimal bias is an ongoing challenge. Many different software strategies...... for data analysis are being developed, and several studies applying diagnostic metagenomics to human clinical samples have been published, detecting, and sometimes, typing bacterial infections. It is possible to obtain a draft genome of the pathogen and to develop methods that can theoretically be applied...

  6. Towards diagnostic metagenomics of Campylobacter in fecal samples.

    Science.gov (United States)

    Andersen, Sandra Christine; Kiil, Kristoffer; Harder, Christoffer Bugge; Josefsen, Mathilde Hasseldam; Persson, Søren; Nielsen, Eva Møller; Hoorfar, Jeffrey

    2017-06-08

    The development of diagnostic metagenomics is driven by the need for universal, culture-independent methods for detection and characterization of pathogens to substitute the time-consuming, organism-specific, and often culture-based laboratory procedures for epidemiological source-tracing. Some of the challenges in diagnostic metagenomics are that it requires a great next-generation sequencing depth and unautomated data analysis. DNA from human fecal samples spiked with 7.75 × 10^1 to 7.75 × 10^7 colony forming unit (CFU)/ml Campylobacter jejuni and chicken fecal samples spiked with 1 × 10^2 to 1 × 10^6 CFU/g Campylobacter jejuni was sequenced and data analysis was done by the metagenomic tools Kraken and CLARK. More hits were obtained at higher spiking levels, however with no significant linear correlations (human samples p = 0.12, chicken samples p = 0.10). Therefore, no definite detection limit could be determined, but the lowest spiking levels found positive were 7.75 × 10^4 CFU/ml in human feces and 10^3 CFU/g in chicken feces. Eight human clinical fecal samples with estimated Campylobacter infection loads from 9.2 × 10^4 to 1.0 × 10^9 CFU/ml were analyzed using the same methods. It was possible to detect Campylobacter in all the clinical samples. Sensitivity in diagnostic metagenomics is improving and has reached a clinically relevant level. There are still challenges to overcome before real-time diagnostic metagenomics can replace quantitative polymerase chain reaction (qPCR) or culture-based surveillance and diagnostics, but it is a promising new technology.

  7. Towards diagnostic metagenomics of Campylobacter in fecal samples

    DEFF Research Database (Denmark)

    Andersen, Sandra Christine; Kiil, Kristoffer; Harder, Christoffer Bugge

    2017-01-01

    The development of diagnostic metagenomics is driven by the need for universal, culture-independent methods for detection and characterization of pathogens to substitute the time-consuming, organism-specific, and often culture-based laboratory procedures for epidemiological source-tracing. Some...... of the challenges in diagnostic metagenomics are that it requires a great next-generation sequencing depth and unautomated data analysis. DNA from human fecal samples spiked with 7.75 × 10^1 to 7.75 × 10^7 colony forming unit (CFU)/ml Campylobacter jejuni and chicken fecal samples spiked with 1 × 10^2 to 1 × 10^6 CFU....../g Campylobacter jejuni was sequenced and data analysis was done by the metagenomic tools Kraken and CLARK. More hits were obtained at higher spiking levels, however with no significant linear correlations (human samples p = 0.12, chicken samples p = 0.10). Therefore, no definite detection limit could...

  8. In silico analyses of metagenomes from human atherosclerotic plaque samples

    DEFF Research Database (Denmark)

    Mitra, Suparna; Drautz-Moses, Daniela I; Alhede, Morten

    2015-01-01

    a challenge. RESULTS: To investigate microbiome diversity within human atherosclerotic tissue samples, we employed high-throughput metagenomic analysis on: (1) atherosclerotic plaques obtained from a group of patients who underwent endarterectomy due to recent transient cerebral ischemia or stroke. (2......) Presumed stabile atherosclerotic plaques obtained from autopsy from a control group of patients who all died from causes not related to cardiovascular disease. Our data provides evidence that suggest a wide range of microbial agents in atherosclerotic plaques, and an intriguing new observation that shows...... these microbiota displayed differences between symptomatic and asymptomatic plaques as judged from the taxonomic profiles in these two groups of patients. Additionally, functional annotations reveal significant differences in basic metabolic and disease pathway signatures between these groups. CONCLUSIONS: We...

  9. A sampling and metagenomic sequencing-based methodology for monitoring antimicrobial resistance in swine herds.

    Science.gov (United States)

    Munk, Patrick; Andersen, Vibe Dalhoff; de Knegt, Leonardo; Jensen, Marie Stengaard; Knudsen, Berith Elkær; Lukjancenko, Oksana; Mordhorst, Hanne; Clasen, Julie; Agersø, Yvonne; Folkesson, Anders; Pamp, Sünje Johanna; Vigre, Håkan; Aarestrup, Frank Møller

    2017-02-01

    Reliable methods for monitoring antimicrobial resistance (AMR) in livestock and other reservoirs are essential to understand the trends, transmission and importance of agricultural resistance. Quantification of AMR is mostly done using culture-based techniques, but metagenomic read mapping shows promise for quantitative resistance monitoring. We evaluated the ability of: (i) MIC determination for Escherichia coli; (ii) cfu counting of E. coli; (iii) cfu counting of aerobic bacteria; and (iv) metagenomic shotgun sequencing to predict expected tetracycline resistance based on known antimicrobial consumption in 10 Danish integrated slaughter pig herds. In addition, we evaluated whether fresh or manure floor samples constitute suitable proxies for intestinal sampling, using cfu counting, qPCR and metagenomic shotgun sequencing. Metagenomic read-mapping outperformed cultivation-based techniques in terms of predicting expected tetracycline resistance based on antimicrobial consumption. Our metagenomic approach had sufficient resolution to detect antimicrobial-induced changes to individual resistance gene abundances. Pen floor manure samples were found to represent rectal samples well when analysed using metagenomics, as they contain the same DNA with the exception of a few contaminating taxa that proliferate in the extraintestinal environment. We present a workflow, from sampling to interpretation, showing how resistance monitoring can be carried out in swine herds using a metagenomic approach. We propose metagenomic sequencing should be part of routine livestock resistance monitoring programmes and potentially of integrated One Health monitoring in all reservoirs. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.

  10. Microbial Metagenomics: Beyond the Genome

    Science.gov (United States)

    Gilbert, Jack A.; Dupont, Christopher L.

    2011-01-01

    Metagenomics literally means “beyond the genome.” Marine microbial metagenomic databases presently comprise ~400 billion base pairs of DNA, only ~3% of that found in 1 ml of seawater. Very soon a trillion-base-pair sequence run will be feasible, so it is time to reflect on what we have learned from metagenomics. We review the impact of metagenomics on our understanding of marine microbial communities. We consider the studies facilitated by data generated through the Global Ocean Sampling expedition, as well as the revolution wrought at the individual laboratory level through next generation sequencing technologies. We review recent studies and discoveries since 2008, provide a discussion of bioinformatic analyses, including conceptual pipelines and sequence annotation, and predict the future of metagenomics, with suggestions of collaborative community studies tailored toward answering some of the fundamental questions in marine microbial ecology.

  11. Computational workflow for the fine-grained analysis of metagenomic samples.

    Science.gov (United States)

    Pérez-Wohlfeil, Esteban; Arjona-Medina, Jose A; Torreno, Oscar; Ulzurrun, Eugenia; Trelles, Oswaldo

    2016-10-25

    The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify changes in the abundance of species under different conditions. Current metagenomic analysis software faces bottlenecks due to the high computational load required to analyze complex samples. A computational open-source workflow has been developed for the detailed analysis of metagenomes. This workflow provides new tools and datafile specifications that facilitate the identification of differences in abundance of reads assigned to taxa (mapping), enables the detection of reads of low-abundance bacteria (producing evidence of their presence), provides new concepts for filtering spurious matches, etc. Innovative visualization ideas for improved display of metagenomic diversity are also proposed to better understand how reads are mapped to taxa. Illustrative examples are provided based on the study of two collections of metagenomes from faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity and their mothers. The proposed workflow provides an open environment that offers the opportunity to perform the mapping process using different reference databases. Additionally, this workflow shows the specifications of the mapping process and datafile formats to facilitate the development of new plugins for further post-processing. This open and extensible platform has been designed with the aim of enabling in-depth analysis of metagenomic samples and better understanding of the underlying biological processes.

  12. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes

    DEFF Research Database (Denmark)

    Nielsen, Henrik Bjørn; Almeida, Mathieu; Juncker, Agnieszka

    2014-01-01

    , such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly...... affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples....

  13. A sampling and metagenomic sequencing-based methodology for monitoring antimicrobial resistance in swine herds

    DEFF Research Database (Denmark)

    Munk, Patrick; Dalhoff Andersen, Vibe; de Knegt, Leonardo

    2016-01-01

    Objectives Reliable methods for monitoring antimicrobial resistance (AMR) in livestock and other reservoirs are essential to understand the trends, transmission and importance of agricultural resistance. Quantification of AMR is mostly done using culture-based techniques, but metagenomic read...... mapping shows promise for quantitative resistance monitoring. Methods We evaluated the ability of: (i) MIC determination for Escherichia coli; (ii) cfu counting of E. coli; (iii) cfu counting of aerobic bacteria; and (iv) metagenomic shotgun sequencing to predict expected tetracycline resistance based...... on known antimicrobial consumption in 10 Danish integrated slaughter pig herds. In addition, we evaluated whether fresh or manure floor samples constitute suitable proxies for intestinal sampling, using cfu counting, qPCR and metagenomic shotgun sequencing. Results Metagenomic read-mapping outperformed...

  14. Metagenomic analysis of viral diversity in respiratory samples from patients with respiratory tract infections in Kuwait.

    Science.gov (United States)

    Madi, Nada; Al-Nakib, Widad; Mustafa, Abu Salim; Habibi, Nazima

    2018-03-01

    A metagenomic approach based on target-independent next-generation sequencing has become a known method for the detection of both known and novel viruses in clinical samples. This study aimed to use the metagenomic sequencing approach to characterize the viral diversity in respiratory samples from patients with respiratory tract infections. We have investigated 86 respiratory samples received from various hospitals in Kuwait between 2015 and 2016 for the diagnosis of respiratory tract infections. A metagenomic approach using the next-generation sequencer to characterize viruses was used. According to the metagenomic analysis, an average of 145,019 reads were identified, and 2% of these reads were of viral origin. Also, metagenomic analysis of the viral sequences revealed many known respiratory viruses, which were detected in 30.2% of the clinical samples. Also, sequences of non-respiratory viruses were detected in 14% of the clinical samples, while sequences of non-human viruses were detected in 55.8% of the clinical samples. The average genome coverage of the viruses was 12% with the highest genome coverage of 99.2% for respiratory syncytial virus, and the lowest was 1% for torque teno midi virus 2. Our results showed 47.7% agreement between multiplex Real-Time PCR and metagenomics sequencing in the detection of respiratory viruses in the clinical samples. Though there are some difficulties in applying this method to clinical samples, such as specimen quality, these observations are indicative of the promising utility of the metagenomic sequencing approach for the identification of respiratory viruses in patients with respiratory tract infections. © 2017 Wiley Periodicals, Inc.

  15. Rhizon Sampling of Pore Waters on Scientific Drilling Expeditions: An Example from the IODP Expedition 302, Arctic Coring Expedition (ACEX)

    Directory of Open Access Journals (Sweden)

    Luzie Schnieders

    2007-03-01

    Full Text Available For more than 35 years, interstitial water (IW) samples have been collected from sediment recovered during marine scientific coring expeditions. Pioneering work of DSDP and ODP quickly demonstrated that pore water chemistry differed from that of overlying seawater and from one location to another for myriad reasons (Sayles and Manheim, 1975). Extraction and analysis of IW samples has now become routine on deep-sea drilling cruises; the ensuing pore water profiles are being used to understand a range of processes, such as subsurface fluid flow (e.g., Brown et al., 2001; Saffer and Screaton, 2003), mineral diagenesis (e.g., Rudnicki et al., 2001; Malone et al., 2002), microbial reactions (e.g., Böttcher and Khim, 2004; D’Hondt et al., 2004), gas hydrate dissociation (e.g., Egeberg and Dickens, 1999; Tréhu et al., 2004), and glacial to interglacial changes in the composition of bottom water (e.g., Paul et al., 2001; Adkins and McIntyre, 2002).

  16. The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific.

    Directory of Open Access Journals (Sweden)

    Douglas B Rusch

    2007-03-01

    Full Text Available The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed "fragment recruitment," addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed "extreme assembly," made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. We present novel methods for measuring the genomic

  17. MetaMLST: multi-locus strain-level bacterial typing from metagenomic samples.

    Science.gov (United States)

    Zolfo, Moreno; Tett, Adrian; Jousson, Olivier; Donati, Claudio; Segata, Nicola

    2017-01-25

    Metagenomic characterization of microbial communities has the potential to become a tool to identify pathogens in human samples. However, software tools able to extract strain-level typing information from metagenomic data are needed. Low-throughput molecular typing schema such as Multilocus Sequence Typing (MLST) are still widely used and provide a wealth of strain-level information that is currently not exploited by metagenomic methods. We introduce MetaMLST, a software tool that reconstructs the MLST loci of microorganisms present in microbial communities from metagenomic data. Tested on synthetic and spiked-in real metagenomes, the pipeline was able to reconstruct the MLST sequences with >98.5% accuracy at coverages as low as 1×. On real samples, the pipeline showed higher sensitivity than assembly-based approaches and it proved successful in identifying strains in epidemic outbreaks as well as in intestinal, skin and gastrointestinal microbiome samples. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. A framework for space-efficient read clustering in metagenomic samples.

    Science.gov (United States)

    Alanko, Jarno; Cunial, Fabio; Belazzougui, Djamal; Mäkinen, Veli

    2017-03-14

    A metagenomic sample is a set of DNA fragments, randomly extracted from multiple cells in an environment, belonging to distinct, often unknown species. Unsupervised metagenomic clustering aims at partitioning a metagenomic sample into sets that approximate taxonomic units, without using reference genomes. Since samples are large and steadily growing, space-efficient clustering algorithms are strongly needed. We design and implement a space-efficient algorithmic framework that solves a number of core primitives in unsupervised metagenomic clustering using just the bidirectional Burrows-Wheeler index and a union-find data structure on the set of reads. When run on a sample of total length n, with m reads of maximum length ℓ each, on an alphabet of total size σ, our algorithms take O(n(t+logσ)) time and just 2n+o(n)+O(max{ℓ σlogn,K logm}) bits of space in addition to the index and to the union-find data structure, where K is a measure of the redundancy of the sample and t is the query time of the union-find data structure. Our experimental results show that our algorithms are practical, they can exploit multiple cores by a parallel traversal of the suffix-link tree, and they are competitive both in space and in time with the state of the art.
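
The role of the union-find structure in the record above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the framework from the abstract: a plain dictionary of k-mers stands in for the bidirectional Burrows-Wheeler index, so the sketch is not space-efficient, and the merge rule (two reads join the same cluster if they share any k-mer) is a simplification chosen for the example. `UnionFind` and `cluster_reads` are hypothetical names.

```python
class UnionFind:
    """Disjoint-set forest with union by size and path halving."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            # Path halving: point x at its grandparent while walking up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def cluster_reads(reads, k):
    """Group read indices so that reads sharing any k-mer end up together."""
    uf = UnionFind(len(reads))
    first_seen = {}  # k-mer -> index of the first read containing it
    for idx, read in enumerate(reads):
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in first_seen:
                uf.union(idx, first_seen[kmer])
            else:
                first_seen[kmer] = idx
    clusters = {}
    for idx in range(len(reads)):
        clusters.setdefault(uf.find(idx), []).append(idx)
    return sorted(clusters.values())


# Toy reads (made-up data): reads 0 and 1 overlap, as do reads 2 and 3.
reads = ["ACGTACGTAA", "GTACGTAATT", "CCCCGGGGCC", "GGGGCCTTTT"]
print(cluster_reads(reads, 6))  # [[0, 1], [2, 3]]
```

The union-find structure only records which reads belong together, so its cost is independent of how the shared substrings are detected; that separation is what lets the detection step be replaced by a compressed index.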

  19. Statistical methods for detecting differentially abundant features in clinical metagenomic samples.

    Directory of Open Access Journals (Sweden)

    James Robert White

    2009-04-01

    Full Text Available Numerous studies are currently underway to characterize the microbial communities inhabiting our world. These studies aim to dramatically expand our understanding of the microbial biosphere and, more importantly, hope to reveal the secrets of the complex symbiotic relationship between us and our commensal bacterial microflora. An important prerequisite for such discoveries are computational tools that are able to rapidly and accurately compare large datasets generated from complex bacterial communities to identify features that distinguish them. We present a statistical method for comparing clinical metagenomic samples from two treatment populations on the basis of count data (e.g. as obtained through sequencing) to detect differentially abundant features. Our method, Metastats, employs the false discovery rate to improve specificity in high-complexity environments, and separately handles sparsely-sampled features using Fisher's exact test. Under a variety of simulations, we show that Metastats performs well compared to previously used methods, and significantly outperforms other methods for features with sparse counts. We demonstrate the utility of our method on several datasets including a 16S rRNA survey of obese and lean human gut microbiomes, COG functional profiles of infant and mature gut microbiomes, and bacterial and viral metabolic subsystem data inferred from random sequencing of 85 metagenomes. The application of our method to the obesity dataset reveals differences between obese and lean subjects not reported in the original study. For the COG and subsystem datasets, we provide the first statistically rigorous assessment of the differences between these populations. The methods described in this paper are the first to address clinical metagenomic datasets comprising samples from multiple subjects. Our methods are robust across datasets of varied complexity and sampling level. While designed for metagenomic applications, our software
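
The handling of sparsely sampled features mentioned in the record above can be illustrated with a short Python sketch of a two-sided Fisher's exact test on a 2x2 table. It is not the Metastats implementation: the table layout (feature counts versus all other counts, pooled within each of the two populations) is an assumption made for this example, the false discovery rate step is omitted, and `fisher_exact_two_sided` is a hypothetical name.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout for this sketch:
      a = counts of the feature in population 1, b = all other counts there,
      c = counts of the feature in population 2, d = all other counts there.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Toy counts (made-up data): the feature is seen 8 times among 10 counts in
# population 1 and once among 6 counts in population 2.
p = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p, 5))  # 0.03497
```

An exact test of this kind does not rely on large-sample approximations, which is why it suits features with very low counts.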

  20. Sampling and Chemical Analysis of Potable Water for ISS Expeditions 12 and 13

    Science.gov (United States)

    Straub, John E. II; Plumlee, Deborah K.; Schultz, John R.

    2007-01-01

    The crews of Expeditions 12 and 13 aboard the International Space Station (ISS) continued to rely on potable water from two different sources, regenerated humidity condensate and Russian ground-supplied water. The Space Shuttle launched twice during the 12- months spanning both expeditions and docked with the ISS for delivery of hardware and supplies. However, no Shuttle potable water was transferred to the station during either of these missions. The chemical quality of the ISS onboard potable water supplies was verified by performing ground analyses of archival water samples at the Johnson Space Center (JSC) Water and Food Analytical Laboratory (WAFAL). Since no Shuttle flights launched during Expedition 12 and there was restricted return volume on the Russian Soyuz vehicle, only one chemical archive potable water sample was collected with U.S. hardware and returned during Expedition 12. This sample was collected in March 2006 and returned on Soyuz 11. The number and sensitivity of the chemical analyses performed on this sample were limited due to low sample volume. Shuttle flights STS-121 (ULF1.1) and STS-115 (12A) docked with the ISS in July and September of 2006, respectively. These flights returned to Earth with eight chemical archive potable water samples that were collected with U.S. hardware during Expedition 13. The average collected volume increased for these samples, allowing full chemical characterization to be performed. This paper presents a discussion of the results from chemical analyses performed on Expeditions 12 and 13 archive potable water samples. In addition to the results from the U.S. samples analyzed, results from pre-flight samples of Russian potable water delivered to the ISS on Progress vehicles and in-flight samples collected with Russian hardware during Expeditions 12 and 13 and analyzed at JSC are also discussed.

  1. Metagenomic detection of viruses in aerosol samples from workers in animal slaughterhouses.

    Science.gov (United States)

    Hall, Richard J; Leblanc-Maridor, Mily; Wang, Jing; Ren, Xiaoyun; Moore, Nicole E; Brooks, Collin R; Peacey, Matthew; Douwes, Jeroen; McLean, David J

    2013-01-01

    Published studies have shown that workers in animal slaughterhouses are at a higher risk of lung cancers as compared to the general population. No specific causal agents have been identified, and exposures to several chemicals have been examined and found to be unrelated. Evidence suggests a biological aetiology as the risk is highest for workers who are exposed to live animals or to biological material containing animal faeces, urine or blood. To investigate possible biological exposures in animal slaughterhouses, we used a metagenomic approach to characterise the profile of organisms present within an aerosol sample. An assessment of aerosol exposures for individual workers was achieved by the collection of personal samples that represent the inhalable fraction of dust/bioaerosol in workplace air in both cattle and sheep slaughterhouses. Two sets of nine personal aerosol samples were pooled for the cattle processing and sheep processing areas respectively, with a total of 332,677,346 sequence reads and 250,144,492 sequence reads of 85 bp in length produced for each. Eukaryotic genome sequence was found in both sampling locations, and bovine, ovine and human sequences were common. Sequences from WU polyomavirus and human papillomavirus 120 were detected in the metagenomic dataset from the cattle processing area, and these sequences were confirmed as being present in the original personal aerosol samples. This study presents the first metagenomic description of personal aerosol exposure and this methodology could be applied to a variety of environments. Also, the detection of two candidate viruses warrants further investigation in the setting of occupational exposures in animal slaughterhouses.

  2. Metagenomic detection of viruses in aerosol samples from workers in animal slaughterhouses.

    Directory of Open Access Journals (Sweden)

    Richard J Hall

    Full Text Available Published studies have shown that workers in animal slaughterhouses are at a higher risk of lung cancers as compared to the general population. No specific causal agents have been identified, and exposures to several chemicals have been examined and found to be unrelated. Evidence suggests a biological aetiology as the risk is highest for workers who are exposed to live animals or to biological material containing animal faeces, urine or blood. To investigate possible biological exposures in animal slaughterhouses, we used a metagenomic approach to characterise the profile of organisms present within an aerosol sample. An assessment of aerosol exposures for individual workers was achieved by the collection of personal samples that represent the inhalable fraction of dust/bioaerosol in workplace air in both cattle and sheep slaughterhouses. Two sets of nine personal aerosol samples were pooled for the cattle processing and sheep processing areas respectively, with a total of 332,677,346 sequence reads and 250,144,492 sequence reads of 85 bp in length produced for each. Eukaryotic genome sequence was found in both sampling locations, and bovine, ovine and human sequences were common. Sequences from WU polyomavirus and human papillomavirus 120 were detected in the metagenomic dataset from the cattle processing area, and these sequences were confirmed as being present in the original personal aerosol samples. This study presents the first metagenomic description of personal aerosol exposure and this methodology could be applied to a variety of environments. Also, the detection of two candidate viruses warrants further investigation in the setting of occupational exposures in animal slaughterhouses.

  3. Comparative analysis of metagenomes from three methanogenic hydrocarbon-degrading enrichment cultures with 41 environmental samples

    Science.gov (United States)

    Tan, Boonfei; Jane Fowler, S; Laban, Nidal Abu; Dong, Xiaoli; Sensen, Christoph W; Foght, Julia; Gieg, Lisa M

    2015-01-01

    Methanogenic hydrocarbon metabolism is a key process in subsurface oil reservoirs and hydrocarbon-contaminated environments and thus warrants greater understanding to improve current technologies for fossil fuel extraction and bioremediation. In this study, three hydrocarbon-degrading methanogenic cultures established from two geographically distinct environments and incubated with different hydrocarbon substrates (added as single hydrocarbons or as mixtures) were subjected to metagenomic and 16S rRNA gene pyrosequencing to test whether these differences affect the genetic potential and composition of the communities. Enrichment of different putative hydrocarbon-degrading bacteria in each culture appeared to be substrate dependent, though all cultures contained both acetate- and H2-utilizing methanogens. Despite differing hydrocarbon substrates and inoculum sources, all three cultures harbored genes for hydrocarbon activation by fumarate addition (bssA, assA, nmsA) and carboxylation (abcA, ancA), along with those for associated downstream pathways (bbs, bcr, bam), though the cultures incubated with hydrocarbon mixtures contained a broader diversity of fumarate addition genes. A comparative metagenomic analysis of the three cultures showed that they were functionally redundant despite their enrichment backgrounds, sharing multiple features associated with syntrophic hydrocarbon conversion to methane. In addition, a comparative analysis of the culture metagenomes with those of 41 environmental samples (containing varying proportions of methanogens) showed that the three cultures were functionally most similar to each other but distinct from other environments, including hydrocarbon-impacted environments (for example, oil sands tailings ponds and oil-affected marine sediments). This study provides a basis for understanding key functions and environmental selection in methanogenic hydrocarbon-associated communities. PMID:25734684

  4. Identification of a novel human papillomavirus by metagenomic analysis of samples from patients with febrile respiratory illness

    NARCIS (Netherlands)

    Mokili, J.L.; Dutilh, B.E.; Lim, Y.W.; Schneider, B.S.; Taylor, T.; Haynes, M.R.; Metzgar, D.; Myers, C.A.; Blair, P.J.; Nosrat, B.; Wolfe, N.D.; Rohwer, F.

    2013-01-01

    As part of a virus discovery investigation using a metagenomic approach, a highly divergent novel Human papillomavirus type was identified in pooled convenience nasal/oropharyngeal swab samples collected from patients with febrile respiratory illness. Phylogenetic analysis of the whole genome and

  5. Metagenomic covariation along densely sampled environmental gradients in the Red Sea

    KAUST Repository

    Thompson, Luke R

    2016-07-15

    Oceanic microbial diversity covaries with physicochemical parameters. Temperature, for example, explains approximately half of global variation in surface taxonomic abundance. It is unknown, however, whether covariation patterns hold over narrower parameter gradients and spatial scales, and extending to mesopelagic depths. We collected and sequenced 45 epipelagic and mesopelagic microbial metagenomes on a meridional transect through the eastern Red Sea. We asked which environmental parameters explain the most variation in relative abundances of taxonomic groups, gene ortholog groups, and pathways—at a spatial scale of <2000 km, along narrow but well-defined latitudinal and depth-dependent gradients. We also asked how microbes are adapted to gradients and extremes in irradiance, temperature, salinity, and nutrients, examining the responses of individual gene ortholog groups to these parameters. Functional and taxonomic metrics were equally well explained (75–79%) by environmental parameters. However, only functional and not taxonomic covariation patterns were conserved when comparing with an intruding water mass with different physicochemical properties. Temperature explained the most variation in each metric, followed by nitrate, chlorophyll, phosphate, and salinity. That nitrate explained more variation than phosphate suggested nitrogen limitation, consistent with low surface N:P ratios. Covariation of gene ortholog groups with environmental parameters revealed patterns of functional adaptation to the challenging Red Sea environment: high irradiance, temperature, salinity, and low nutrients. Nutrient-acquisition gene ortholog groups were anti-correlated with concentrations of their respective nutrient species, recapturing trends previously observed across much larger distances and environmental gradients. This dataset of metagenomic covariation along densely sampled environmental gradients includes online data exploration supplements, serving as a community

  6. Modular approach to customise sample preparation procedures for viral metagenomics: a reproducible protocol for virome analysis.

    Science.gov (United States)

    Conceição-Neto, Nádia; Zeller, Mark; Lefrère, Hanne; De Bruyn, Pieter; Beller, Leen; Deboutte, Ward; Yinda, Claude Kwe; Lavigne, Rob; Maes, Piet; Van Ranst, Marc; Heylen, Elisabeth; Matthijnssens, Jelle

    2015-11-12

    A major limitation for better understanding the role of the human gut virome in health and disease is the lack of validated methods that allow high throughput virome analysis. To overcome this, we evaluated the quantitative effect of homogenisation, centrifugation, filtration, chloroform treatment and random amplification on a mock-virome (containing nine highly diverse viruses) and a bacterial mock-community (containing four faecal bacterial species) using quantitative PCR and next-generation sequencing. This resulted in an optimised protocol that was able to recover all viruses present in the mock-virome and strongly alters the ratio of viral versus bacterial and 16S rRNA genetic material in favour of viruses (from 43.2% to 96.7% viral reads and from 47.6% to 0.19% bacterial reads). Furthermore, our study indicated that most of the currently used virome protocols, using small filter pores and/or stringent centrifugation conditions may have largely overlooked large viruses present in viromes. We propose NetoVIR (Novel enrichment technique of VIRomes), which allows for a fast, reproducible and high throughput sample preparation for viral metagenomics studies, introducing minimal bias. This procedure is optimised mainly for faecal samples, but with appropriate concentration steps can also be used for other sample types with lower initial viral loads.
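
    The enrichment reported in the abstract above can be expressed as a change in the viral-to-bacterial read ratio. The following is a minimal sketch of that arithmetic using only the four percentages quoted in the abstract; the function name and the choice of a simple ratio as the summary metric are illustrative assumptions, not part of the NetoVIR protocol.

```python
# Read percentages quoted in the abstract (before and after the optimised protocol).
VIRAL_BEFORE, BACTERIAL_BEFORE = 43.2, 47.6
VIRAL_AFTER, BACTERIAL_AFTER = 96.7, 0.19


def viral_to_bacterial_ratio(viral_pct: float, bacterial_pct: float) -> float:
    """Return the ratio of viral to bacterial read percentages (hypothetical helper)."""
    return viral_pct / bacterial_pct


ratio_before = viral_to_bacterial_ratio(VIRAL_BEFORE, BACTERIAL_BEFORE)
ratio_after = viral_to_bacterial_ratio(VIRAL_AFTER, BACTERIAL_AFTER)

# Fold change in the ratio, i.e. how strongly the protocol shifts reads towards viruses.
fold_enrichment = ratio_after / ratio_before

print(f"ratio before: {ratio_before:.2f}")
print(f"ratio after:  {ratio_after:.1f}")
print(f"fold change:  {fold_enrichment:.0f}")
```

    Under these assumptions the ratio moves from slightly below 1 to roughly 500, which is one way to read the abstract's statement that the protocol strongly shifts the balance in favour of viruses.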

  7. Utilizing the International GeoSample Number Concept during ICDP Expedition COSC

    Science.gov (United States)

    Conze, Ronald; Lorenz, Henning; Ulbricht, Damian; Gorgas, Thomas; Elger, Kirsten

    2016-04-01

    The concept of the International GeoSample Number (IGSN) was introduced to uniquely identify and register geo-related sample material, and make it retrievable via electronic media (e.g., SESAR - http://www.geosamples.org/igsnabout). The general aim of the IGSN concept is to improve access to stored sample material worldwide and to enable the exact identification of samples, their origin and provenance, as well as the exact and complete citation of acquired samples throughout the literature. The ICDP expedition COSC (Collisional Orogeny in the Scandinavian Caledonides, http://cosc.icdp-online.org) was the first occasion in ICDP's history on which IGSNs were assigned and registered during an ongoing drilling campaign. ICDP drilling expeditions commonly use the Drilling Information System DIS (http://doi.org/10.2204/iodp.sd.4.07.2007) for the inventory of recovered sample material. During COSC, IGSNs were assigned to every drill hole, core run, core section, and sample taken from core material. The original IGSN specification has been extended to achieve the required uniqueness of IGSNs with our offline procedure. The ICDP name space indicator and the Expedition ID (5054) form an extended prefix (ICDP5054). For every type of sample material, an encoded sequence of characters follows. This sequence is derived from the DIS naming convention, which is unique from the beginning. Every ICDP expedition thereby has an unlimited name space for IGSN assignments. This direct derivation of IGSNs from the DIS database context ensures the distinct parent-child hierarchy of the IGSNs among each other. In the case of COSC, this method of inventory-keeping of all drill cores was carried out routinely using the ExpeditionDIS during field work and the subsequent sampling party. After completing the field campaign, all sample material was transferred to the "Nationales Bohrkernlager" in Berlin-Spandau, Germany. Corresponding data was subsequently imported into the CurationDIS used at the aforementioned core storage

  8. Soup to Tree: The Phylogeny of Beetles Inferred by Mitochondrial Metagenomics of a Bornean Rainforest Sample

    Science.gov (United States)

    Crampton-Platt, Alex; Timmermans, Martijn J.T.N.; Gimmel, Matthew L.; Kutty, Sujatha Narayanan; Cockerill, Timothy D.; Vun Khen, Chey; Vogler, Alfried P.

    2015-01-01

    In spite of the growth of molecular ecology, systematics and next-generation sequencing, the discovery and analysis of diversity is not currently integrated with building the tree-of-life. Tropical arthropod ecologists are well placed to accelerate this process if all specimens obtained through mass-trapping, many of which will be new species, could be incorporated routinely into phylogeny reconstruction. Here we test a shotgun sequencing approach, whereby mitochondrial genomes are assembled from complex ecological mixtures through mitochondrial metagenomics, and demonstrate how the approach overcomes many of the taxonomic impediments to the study of biodiversity. DNA from approximately 500 beetle specimens, originating from a single rainforest canopy fogging sample from Borneo, was pooled and shotgun sequenced, followed by de novo assembly of complete and partial mitogenomes for 175 species. The phylogenetic tree obtained from this local sample was highly similar to that from existing mitogenomes selected for global coverage of major lineages of Coleoptera. When all sequences were combined only minor topological changes were induced against this reference set, indicating an increasingly stable estimate of coleopteran phylogeny, while the ecological sample expanded the tip-level representation of several lineages. Robust trees generated from ecological samples now enable an evolutionary framework for ecology. Meanwhile, the inclusion of uncharacterized samples in the tree-of-life rapidly expands taxon and biogeographic representation of lineages without morphological identification. Mitogenomes from shotgun sequencing of unsorted environmental samples and their associated metadata, placed robustly into the phylogenetic tree, constitute novel DNA “superbarcodes” for testing hypotheses regarding global patterns of diversity. PMID:25957318

  9. Comparative analysis of the sensitivity of metagenomic sequencing and PCR to detect a biowarfare simulant (Bacillus atrophaeus) in soil samples

    Science.gov (United States)

    Plaire, Delphine; Puaud, Simon; Marsolier-Kergoat, Marie-Claude

    2017-01-01

    To evaluate the sensitivity of high-throughput DNA sequencing for monitoring biowarfare agents in the environment, we analysed soil samples inoculated with different amounts of Bacillus atrophaeus, a surrogate organism for Bacillus anthracis. The soil samples considered were a poorly carbonated soil of the silty sand class, and a highly carbonated soil of the silt class. Control soil samples and soil samples inoculated with 10, 10^3, or 10^5 cfu were processed for DNA extraction. About 1% of the DNA extracts was analysed through the sequencing of more than 10^8 reads. Similar amounts of extracts were also studied for Bacillus atrophaeus DNA content by real-time PCR. We demonstrate that, for both soils, high-throughput sequencing is at least as sensitive as real-time PCR in detecting Bacillus atrophaeus DNA. We conclude that metagenomics allows the detection of less than 10 ppm of DNA from a biowarfare simulant in complex environmental samples. PMID:28472119

  10. Sequencing at sea: challenges and experiences in Ion Torrent PGM sequencing during the 2013 Southern Line Islands Research Expedition

    NARCIS (Netherlands)

    Lim, Y.W.; Cuevas, D.A.; Silva, G.G.; Aguinaldo, K.; Dinsdale, E.A.; Haas, A.F.; Hatay, M.; Sanchez, S.E.; Wegley-Kelly, L.; Dutilh, B.E.; Harkins, T.T.; Lee, C.C.; Tom, W.; Sandin, S.A.; Smith, J.E.; Zgliczynski, B.; Vermeij, M.J.; Rohwer, F.; Edwards, R.A.

    2014-01-01

    Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned

  11. Sequencing at sea: challenges and experiences in Ion Torrent PGM sequencing during the 2013 Southern Line Islands Research Expedition

    NARCIS (Netherlands)

    Lim, Yan Wei; Cuevas, Daniel A; Silva, Genivaldo Gueiros Z; Aguinaldo, Kristen; Dinsdale, Elizabeth A; Haas, Andreas F; Hatay, Mark; Sanchez, Savannah E; Wegley-Kelly, Linda; Dutilh, Bas E; Harkins, Timothy T; Lee, Clarence C; Tom, Warren; Sandin, Stuart A; Smith, Jennifer E; Zgliczynski, Brian; Vermeij, Mark J A; Rohwer, Forest; Edwards, Robert A

    2014-01-01

    Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned

  12. Identification of a novel human papillomavirus by metagenomic analysis of samples from patients with febrile respiratory illness.

    Directory of Open Access Journals (Sweden)

    John L Mokili

    Full Text Available As part of a virus discovery investigation using a metagenomic approach, a highly divergent novel Human papillomavirus type was identified in pooled convenience nasal/oropharyngeal swab samples collected from patients with febrile respiratory illness. Phylogenetic analysis of the whole genome and the L1 gene reveals that the new HPV identified in this study clusters with previously described gamma papillomaviruses, sharing only 61.1% (whole genome) and 63.1% (L1) sequence identity with its closest relative in the Papillomavirus episteme (PAVE) database. This new virus was named HPV_SD2 pending official classification. The complete genome of HPV-SD2 is 7,299 bp long (36.3% G/C) and contains 7 open reading frames (L2, L1, E6, E7, E1, E2 and E4) and a non-coding long control region (LCR) between L1 and E6. The metagenomic procedures, coupled with the bioinformatic methods described herein, are well suited to detect small circular genomes such as those of human papillomaviruses.

  13. Virome profiling of bats from Myanmar by metagenomic analysis of tissue samples reveals more novel Mammalian viruses.

    Directory of Open Access Journals (Sweden)

    Biao He

    Full Text Available Bats are reservoir animals harboring many important pathogenic viruses and with the capability of transmitting these to humans and other animals. To establish an effective surveillance to monitor transboundary spread of bat viruses between Myanmar and China, complete organs from the thorax and abdomen from 853 bats of six species from two Myanmar counties close to Yunnan province, China, were collected and tested for their virome through metagenomics by Solexa sequencing and bioinformatic analysis. In total, 3,742,314 reads of 114 bases were generated, and over 86% were assembled into 1,649,512 contigs with an average length of 114 bp, of which 26,698 (2%) contigs were recognizable viral sequences belonging to 24 viral families. Of the viral contigs, 45% (12,086/26,698) were related to vertebrate viruses, 28% (7,443/26,698) to insect viruses, 27% (7,074/26,698) to phages and 95 contigs to plant viruses. The metagenomic results were confirmed by PCR of selected viruses in all bat samples followed by phylogenetic analysis, which has led to the discovery of some novel bat viruses of the genera Mamastrovirus, Bocavirus, Circovirus, Iflavirus and Orthohepadnavirus and to their prevalence rates in two bat species. In conclusion, the present study aims to present the bat virome in Myanmar, and the results obtained further expand the spectrum of viruses harbored by bats.

  14. The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families

    Energy Technology Data Exchange (ETDEWEB)

    Yooseph, Shibu; Sutton, Granger; Rusch, Douglas B.; Halpern, Aaron L.; Williamson, Shannon J.; Remington, Karin; Eisen, Jonathan A.; Heidelberg, Karla B.; Manning, Gerard; Li, Weizhong; Jaroszewski, Lukasz; Cieplak, Piotr; Miller, Christopher S.; Li, Huiying; Mashiyama, Susan T.; Joachimiak, Marcin P.; van Belle, Christopher; Chandonia, John-Marc; Soergel, David A.; Zhai, Yufeng; Natarajan, Kannan; Lee, Shaun; Raphael, Benjamin J.; Bafna, Vineet; Friedman, Robert; Brenner, Steven E.; Godzik, Adam; Eisenberg, David; Dixon, Jack E.; Taylor, Susan S.; Strausberg, Robert L.; Frazier, Marvin; Venter, J. Craig

    2006-03-23

    Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.

  15. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.

    Directory of Open Access Journals (Sweden)

    Shibu Yooseph

    2007-03-01

    Full Text Available Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.

  16. Persistent organic pollutants in biota samples collected during the Ymer-80 expedition to the Arctic

    Directory of Open Access Journals (Sweden)

    Henrik Kylin

    2015-10-01

    Full Text Available During the 1980 expedition to the Arctic with the icebreaker Ymer, a number of vertebrate species were sampled for determination of persistent organic pollutants. Samples of Arctic char (Salvelinus alpinus, n=34), glaucous gull (Larus hyperboreus, n=8), common eider (Somateria mollissima, n=10), Brünnich's guillemot (Uria lomvia, n=9), ringed seal (Pusa hispida, n=2) and polar bear (Ursus maritimus, n=2) were collected. With the exception of Brünnich's guillemot, there was a marked contamination difference of birds from western as compared to eastern/northern Svalbard. Samples in the west contained a larger number of polychlorinated biphenyl (PCB) congeners and also polychlorinated terphenyls, indicating local sources. Brünnich's guillemots had similar pollutant concentrations in the west and east/north; possibly younger birds were sampled in the west. In Arctic char, pollutant profiles from lake Linnévatn (n=5), the lake closest to the main economic activities in Svalbard, were similar to profiles in Arctic char from the Shetland Islands (n=5), but differed from lakes to the north and east in Svalbard (n=30). Arctic char samples had higher concentrations of hexachlorocyclohexanes (HCHs) than the marine species of birds and mammals, possibly due to accumulation via snowmelt. Compared to the Baltic Sea, comparable species collected in Svalbard had lower concentrations of PCB and dichlorodiphenyltrichloroethane (DDT), but similar concentrations indicating long-range transport of hexachlorobenzene, HCHs and cyclodiene pesticides. In samples collected in Svalbard in 1971, the concentrations of PCB and DDT in Brünnich's guillemot (n=7), glaucous gull (n=2) and polar bear (n=2) were similar to the concentrations found in 1980.

  17. Culture-independent detection and characterisation of Mycobacterium tuberculosis and M. africanum in sputum samples using shotgun metagenomics on a benchtop sequencer

    Directory of Open Access Journals (Sweden)

    Emma L. Doughty

    2014-09-01

    Full Text Available Tuberculosis remains a major global health problem. Laboratory diagnostic methods that allow effective, early detection of cases are central to management of tuberculosis in the individual patient and in the community. Since the 1880s, laboratory diagnosis of tuberculosis has relied primarily on microscopy and culture. However, microscopy fails to provide species- or lineage-level identification and culture-based workflows for diagnosis of tuberculosis remain complex, expensive, slow, technically demanding and poorly able to handle mixed infections. We therefore explored the potential of shotgun metagenomics, sequencing of DNA from samples without culture or target-specific amplification or capture, to detect and characterise strains from the Mycobacterium tuberculosis complex in smear-positive sputum samples obtained from The Gambia in West Africa. Eight smear- and culture-positive sputum samples were investigated using a differential-lysis protocol followed by a kit-based DNA extraction method, with sequencing performed on a benchtop sequencing instrument, the Illumina MiSeq. The number of sequence reads in each sputum-derived metagenome ranged from 989,442 to 2,818,238. The proportion of reads in each metagenome mapping against the human genome ranged from 20% to 99%. We were able to detect sequences from the M. tuberculosis complex in all eight samples, with coverage of the H37Rv reference genome ranging from 0.002X to 0.7X. By analysing the distribution of large sequence polymorphisms (deletions and the locations of the insertion element IS6110) and single nucleotide polymorphisms (SNPs), we were able to assign seven of eight metagenome-derived genomes to a species and lineage within the M. tuberculosis complex. Two metagenome-derived mycobacterial genomes were assigned to M. africanum, a species largely confined to West Africa; the others that could be assigned belonged to lineages T, H or LAM within the clade of “modern” M. tuberculosis

  18. Culture-independent detection and characterisation of Mycobacterium tuberculosis and M. africanum in sputum samples using shotgun metagenomics on a benchtop sequencer.

    Science.gov (United States)

    Doughty, Emma L; Sergeant, Martin J; Adetifa, Ifedayo; Antonio, Martin; Pallen, Mark J

    2014-01-01

    Tuberculosis remains a major global health problem. Laboratory diagnostic methods that allow effective, early detection of cases are central to management of tuberculosis in the individual patient and in the community. Since the 1880s, laboratory diagnosis of tuberculosis has relied primarily on microscopy and culture. However, microscopy fails to provide species- or lineage-level identification and culture-based workflows for diagnosis of tuberculosis remain complex, expensive, slow, technically demanding and poorly able to handle mixed infections. We therefore explored the potential of shotgun metagenomics, sequencing of DNA from samples without culture or target-specific amplification or capture, to detect and characterise strains from the Mycobacterium tuberculosis complex in smear-positive sputum samples obtained from The Gambia in West Africa. Eight smear- and culture-positive sputum samples were investigated using a differential-lysis protocol followed by a kit-based DNA extraction method, with sequencing performed on a benchtop sequencing instrument, the Illumina MiSeq. The number of sequence reads in each sputum-derived metagenome ranged from 989,442 to 2,818,238. The proportion of reads in each metagenome mapping against the human genome ranged from 20% to 99%. We were able to detect sequences from the M. tuberculosis complex in all eight samples, with coverage of the H37Rv reference genome ranging from 0.002X to 0.7X. By analysing the distribution of large sequence polymorphisms (deletions and the locations of the insertion element IS6110) and single nucleotide polymorphisms (SNPs), we were able to assign seven of eight metagenome-derived genomes to a species and lineage within the M. tuberculosis complex. Two metagenome-derived mycobacterial genomes were assigned to M. africanum, a species largely confined to West Africa; the others that could be assigned belonged to lineages T, H or LAM within the clade of "modern" M. tuberculosis strains. We have

  19. Metagenomic Analysis of Silage

    OpenAIRE

    Tennant, Richard K.; Sambles, Christine M; Diffey, Georgina E.; Moore, Karen A.; Love, John

    2017-01-01

    Metagenomics is defined as the direct analysis of deoxyribonucleic acid (DNA) purified from environmental samples and enables taxonomic identification of the microbial communities present within them. Two main metagenomic approaches exist; sequencing the 16S rRNA gene coding region, which exhibits sufficient variation between taxa for identification, and shotgun sequencing, in which genomes of the organisms that are present in the sample are analyzed and ascribed to "operational taxonomic uni...

  20. Evaluating the metagenome of two sampling locations in the nasal cavity of cattle with bovine respiratory disease complex

    Science.gov (United States)

    Bovine respiratory disease complex (BRDC) is a multi-factor disease, and disease incidence may be associated with an animal’s commensal microbiota (metagenome). Evaluation of the animal’s resident microbiota in the nasal cavity may help us to understand the impact of the metagenome on incidence of ...

  1. A short note on the cephalopods sampled in the Angola Basin during the DIVA I-expedition

    OpenAIRE

    Piatkowski, Uwe; Diekmann, Rabea

    2005-01-01

    Five cephalopods, all belonging to different species, were identified from deep-sea trawl samples conducted during the DIVA 1-expedition of RV “Meteor” in the Angola Basin in July 2000. These were the teuthoid squids Bathyteuthis abyssicola, Brachioteuthis riisei, Mastigoteuthis atlantica, Galiteuthis armata, and the finned deep-sea octopus Grimpoteuthis wuelkeri. The present study contributes information on size, morphometry, biology and distribution of the species from this unique cephalopo...

  2. Integration of Metagenomic and Biogeochemical Data from Soils Sampled from a Long-Term Reciprocal Transplant

    Science.gov (United States)

    Bailey, V. L.; Hess, N. J.; McCue, L. A.

    2014-12-01

    The long-term impacts of climate conditions on soil ecosystems are difficult to discern with sufficient resolution to underpin a predictive understanding of ecosystem response to global climate change. The structure and function of the microbial community is intimately linked to soil organic carbon (SOC) by both the deposition of new carbon, and metabolism and respiration of existing SOC. We are studying the resilience of the microbial community, and the vulnerability of the soil carbon reservoirs, to changing climate conditions using a reciprocal soil transplant experiment initiated in 1994 in eastern Washington. Soil cores were reciprocally transplanted between two elevations (310 m and 844 m); the lower site is warmer and drier with 0.8% soil carbon, and the upper site is cooler and wetter with 1.8% soil carbon. We resampled these cores in 2012-13 to analyze the structure of the microbial community, biochemical activities of carbohydrate-active enzymes, and the soil carbon and nitrogen content. We hypothesized that microbial and biochemical dynamics developed under cool, moist conditions would destabilize under hot, dry conditions, such that carbon and nitrogen losses would be faster in warmer climate soils than the accruals in cooler climate soils. Metagenomics data analyses show that the microbial communities below 5 cm depth in the transplanted soils are most similar to those in the native and control soils from their original (pre-1994) location, whereas the surface microbial community has been influenced by their new (post-1994) location. Enzyme activities are highest in soils from the cooler, moister location, and the activities of the reciprocally transplanted soils are shifting toward the activities typical of their new location. Integration of these results with high-resolution mass spectrometry data of the soil carbon moieties will contribute to our fundamental understanding of climate change effects on the terrestrial ecosystem carbon cycle.

  3. Applying meta-pathway analyses through metagenomics to identify the functional properties of the major bacterial communities of a single spontaneous cocoa bean fermentation process sample.

    Science.gov (United States)

    Illeghems, Koen; Weckx, Stefan; De Vuyst, Luc

    2015-09-01

    A high-resolution functional metagenomic analysis of a representative single sample of a Brazilian spontaneous cocoa bean fermentation process was carried out to gain insight into its bacterial community functioning. By reconstruction of microbial meta-pathways based on metagenomic data, the current knowledge about the metabolic capabilities of bacterial members involved in the cocoa bean fermentation ecosystem was extended. Functional meta-pathway analysis revealed the distribution of the metabolic pathways between the bacterial members involved. The metabolic capabilities of the lactic acid bacteria present were most associated with the heterolactic fermentation and citrate assimilation pathways. The role of Enterobacteriaceae in the conversion of substrates was shown through the use of the mixed-acid fermentation and methylglyoxal detoxification pathways. Furthermore, several other potential functional roles for Enterobacteriaceae were indicated, such as pectinolysis and citrate assimilation. Concerning acetic acid bacteria, metabolic pathways were partially reconstructed, in particular those related to responses toward stress, explaining their metabolic activities during cocoa bean fermentation processes. Further, the in-depth metagenomic analysis unveiled functionalities involved in bacterial competitiveness, such as the occurrence of CRISPRs and potential bacteriocin production. Finally, comparative analysis of the metagenomic data with bacterial genomes of cocoa bean fermentation isolates revealed the applicability of the selected strains as functional starter cultures. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Identification of a novel interspecific hybrid yeast from a metagenomic spontaneously inoculated beer sample using Hi-C.

    Science.gov (United States)

    Smukowski Heil, Caiti; Burton, Joshua N; Liachko, Ivan; Friedrich, Anne; Hanson, Noah A; Morris, Cody L; Schacherer, Joseph; Shendure, Jay; Thomas, James H; Dunham, Maitreya J

    2018-01-01

    Interspecific hybridization is a common mechanism enabling genetic diversification and adaptation; however, the detection of hybrid species has been quite difficult. The identification of microbial hybrids is made even more complicated, as most environmental microbes are resistant to culturing and must be studied in their native mixed communities. We have previously adapted the chromosome conformation capture method Hi-C to the assembly of genomes from mixed populations. Here, we show the method's application in assembling genomes directly from an uncultured, mixed population from a spontaneously inoculated beer sample. Our assembly method has enabled us to de-convolute four bacterial and four yeast genomes from this sample, including a putative yeast hybrid. Downstream isolation and analysis of this hybrid confirmed its genome to consist of Pichia membranifaciens and that of another related, but undescribed, yeast. Our work shows that Hi-C-based metagenomic methods can overcome the limitation of traditional sequencing methods in studying complex mixtures of genomes. Copyright © 2017 John Wiley & Sons, Ltd.

  5. Large-scale targeted metagenomics analysis of bacterial ecological changes in 88 kimchi samples during fermentation.

    Science.gov (United States)

    Lee, Moeun; Song, Jung Hee; Jung, Min Young; Lee, Se Hee; Chang, Ji Yoon

    2017-09-01

    The microbial communities in kimchi vary widely, but the precise effects of differences in region of origin, ingredients, and preparation method on the microbiota are unclear. We analyzed the bacterial community composition of household (n = 69) and commercial (n = 19) kimchi samples obtained from six Korean provinces between April and August 2015. Samples were analyzed by barcoded pyrosequencing targeting the V1-V3 region of the 16S ribosomal RNA gene. The initial pH of the kimchi samples was 5.00-6.39, and the salt concentration was 1.72-4.42%. Except for sampling locality, all categorical variables, i.e., salt concentration, major ingredient, fermentation period, sampling time, and manufacturing process, influenced the bacterial community composition. Particularly, samples were highly clustered by sampling time and salt concentration in non-metric multidimensional scaling plots and an analysis of similarity. These results indicated that the microbial community differed according to fermentation conditions such as salt concentration, major ingredient, fermentation period, and sampling time. Furthermore, fermentation properties, including pH, acidity, salt concentration, and microbial abundance differed between kimchi samples from household and commercial sources. Analyses of changes in bacterial ecology during fermentation will improve our understanding of the biological properties of kimchi, as well as the relationships between these properties and the microbiota of kimchi. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. Searching for signatures across microbial communities: Metagenomic analysis of soil samples from mangrove and other ecosystems.

    Science.gov (United States)

    Imchen, Madangchanok; Kumavath, Ranjith; Barh, Debmalya; Avezedo, Vasco; Ghosh, Preetam; Viana, Marcus; Wattam, Alice R

    2017-08-18

    In this study, we categorize the microbial community in mangrove sediment samples from four different locations within a vast mangrove system in Kerala, India. We compared these data to samples from other known mangrove datasets, a tropical rainforest, and ocean sediment. An examination of the microbial communities from a large mangrove forest that stretches across southwestern India showed strong similarities across the higher taxonomic levels. When ocean sediment and a single isolate from a tropical rain forest were included in the analysis, a strong pattern emerged with Bacteria from the phylum Proteobacteria being the prominent taxon among the forest samples. The ocean samples were predominantly Archaea, with Euryarchaeota as the dominant phylum. Principal component and functional analyses grouped the samples isolated from forests, including those from disparate mangrove forests and the tropical rain forest, separately from the ocean samples. Our findings show similar patterns in samples isolated from forests, and these were distinct from the ocean sediment isolates. The taxonomic structure was maintained to the level of class, and functional analysis of the genes present also displayed these similarities. Our report shows for the first time the richness of microbial diversity along the Kerala coast and its differences from the tropical rain forest and ocean microbiomes.

  7. Partial least squares regression can aid in detecting differential abundance of multiple features in sets of metagenomic samples

    Directory of Open Access Journals (Sweden)

    Ondrej Libiger

    2015-12-01

    Full Text Available It is now feasible to examine the composition and diversity of microbial communities (i.e., 'microbiomes') that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology 'Metastats' across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency

  8. Metagenomic analyses of novel viruses and plasmids from a cultured environmental sample of hyperthermophilic neutrophiles

    DEFF Research Database (Denmark)

    Garrett, Roger Antony; Prangishvili, David; Shah, Shiraz Ali

    2010-01-01

    ), and derive apparently from archaeal viruses HAV1 and HAV2. Genomic DNA was obtained from samples enriched in filamentous and tadpole-shaped virus-like particles respectively. They yielded few significant matches in public sequence databases reinforcing, further, the wide diversity of archaeal viruses...

  9. Databases of the marine metagenomics.

    Science.gov (United States)

    Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach to metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data and increases data complexity. Moreover, when the metagenomic approach is used to monitor changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at enormous speed. Because this situation is becoming a reality at many marine research institutions and stations all over the world, it seems clear that data management and analysis will be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge should be extracted from a vast amount of data. In this review, we outline all the major marine metagenome databases that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that only six metagenome databases include marine metagenome data, an unexpectedly small number. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database and for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Databases of the marine metagenomics

    KAUST Repository

    Mineta, Katsuhiko

    2015-10-28

    Metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach to metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data and increases data complexity. Moreover, when the metagenomic approach is used to monitor changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at enormous speed. Because this situation is becoming a reality at many marine research institutions and stations all over the world, it seems clear that data management and analysis will be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge should be extracted from a vast amount of data. In this review, we outline all the major marine metagenome databases that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that only six metagenome databases include marine metagenome data, an unexpectedly small number. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database and for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database.

  11. An Improved Method for High Quality Metagenomics DNA Extraction from Human and Environmental Samples

    DEFF Research Database (Denmark)

    Bag, Satyabrata; Saha, Bipasa; Mehta, Ojasvi

    2016-01-01

    To explore the natural microbial community of any ecosystem by high-resolution molecular approaches, including next generation sequencing, it is extremely important to develop a sensitive and reproducible DNA extraction method that facilitates isolation of microbial DNA of sufficient purity and quantity from culturable and uncultured microbial species living in that environment. Proper lysis of heterogeneous community microbial cells without damaging their genomes is a major challenge. In this study, we have developed an improved method for extraction of community DNA from different environmental and human origin samples. We introduced a combination of physical, chemical and mechanical lysis methods for proper lysis of microbial inhabitants. The community microbial DNA was precipitated by using salt and organic solvent. Both the quality and quantity of isolated DNA was compared with the existing...

  12. Metagenomics and Applications

    Directory of Open Access Journals (Sweden)

    L Rafati

    2016-11-01

    Full Text Available Introduction: Bacteria are highly diverse in nature, yet only very few of them can be grown and isolated in current standard laboratories. Over the last decade, metagenomics, as a new field of research, has worked to elucidate the genomes of non-cultured microbes, and researchers around the world are studying this group of bacteria in depth in search of new compounds such as antibiotics, anti-cancer agents, enzymes and biomolecules. Methods: This article is a review study based on texts and Internet sources, collected by manually searching keywords in reliable scientific resources and sites, including Google Scholar, PubMed, ScienceDirect, SID and Scopus, for the years 2000 to 2013. Results: The data collected in the study include all printed metagenomics-related texts. Although metagenomics is currently used to screen samples, it is expected to gain a stronger position as a technique used alongside culture media and other traditional techniques. Metagenomics is used most in clinical cases where the microbial cause cannot be identified with conventional techniques. Skilled scientists are therefore needed to run the tests and analyze the information. Conclusion: This paper focuses on some of the latest achievements of metagenomics and its applications in new drugs, enzyme detection, biotechnology and the environment.

  13. [Meta-Mesh: metagenomic data analysis system].

    Science.gov (United States)

    Su, Xiaoquan; Song, Baoxing; Wang, Xuetao; Ma, Xinle; Xu, Jian; Ning, Kang

    2014-01-01

    With the current accumulation of metagenome data, it is possible to build an integrated platform for processing rigorously selected metagenomic samples of interest (also referred to as "metagenomic communities" here). Any metagenomic sample could then be searched against this database to find the most similar sample(s). However, on the one hand, current databases with a large number of metagenomic samples mostly serve as data repositories rather than well-annotated databases, and offer only a few functions for analysis. On the other hand, the few available methods for measuring the similarity of metagenomic data can only compare a few pre-defined sets of metagenomes. Scientists have long sought to calculate similarities between microbial communities in a large repository effectively, to examine how similar these samples are and to find correlations in the meta-information of these samples. In this work we propose a novel system, Meta-Mesh, which includes a metagenomic database and its companion analysis platform that can systematically and efficiently analyze, compare and search for similar metagenomic samples. In the database part, we have collected more than 7 000 high-quality and well-annotated metagenomic samples from the public domain and in-house facilities. The analysis platform supplies a list of online tools that can accept metagenomic samples, build taxonomical annotations, compare samples from multiple angles, and then search for similar samples against its database using a fast indexing strategy and scoring function. We also used case studies of "database search for identification" and "samples clustering based on similarity matrix" using human-associated habitat samples to demonstrate the performance of Meta-Mesh in metagenomic analysis. Therefore, Meta-Mesh would serve as a database and data analysis system to quickly parse and identify similar

  14. Metagenomics of the subsurface Brazos-Trinity Basin (IODP site 1320): comparison with other sediment and pyrosequenced metagenomes.

    Science.gov (United States)

    Biddle, Jennifer F; White, James Robert; Teske, Andreas P; House, Christopher H

    2011-06-01

    The Brazos-Trinity Basin on the slope of the Gulf of Mexico passive margin was drilled during Integrated Ocean Drilling Program Expedition 308. The buried anaerobic sediments of this basin are largely organic-poor and have few microbial inhabitants compared with the organic-rich sediments with high cell counts from the Peru Margin that were drilled during Ocean Drilling Program Leg 201. Nucleic acids were extracted from Brazos-Trinity Basin sediments and were subjected to whole-genome amplification and pyrosequencing. A comparison of the Brazos-Trinity Basin metagenome, consisting of 105 Mbp, and the existing Peru Margin metagenome revealed trends linking gene content, phylogenetic content, geological location and geochemical regime. The major microbial groups (Proteobacteria, Firmicutes, Euryarchaeota and Chloroflexi) occur consistently throughout all samples, yet their shifting abundances allow for discrimination between samples. The cluster of orthologous groups category abundances for some classes of genes are correlated with geochemical factors, such as the level of ammonia. Here we describe the sediment metagenome from the oligotrophic Brazos-Trinity Basin (Site 1320) and show similarities and differences with the dataset from the Pacific Peru Margin (Site 1229) and other pyrosequenced datasets. The microbial community found at Integrated Ocean Drilling Program Site 1320 likely represents the subsurface microbial inhabitants of turbiditic slopes that lack substantial upwelling.

  15. Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls.

    Science.gov (United States)

    Mangericao, Tatiana C; Peng, Zhanhao; Zhang, Xuegong

    2016-01-11

    CRISPR has become a hot topic as a powerful technique for genome editing in humans and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied for studying the gut microbiomes. Most efforts in metagenome studies have focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of the microbiomes themselves, especially their CRISPR composition. We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set of Chinese individuals, comprising type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made for the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel ones. The computational analysis of CRISPR composition based on contigs of metagenome sequencing data is feasible. It provides an efficient

  16. The metagenomic telescope.

    Directory of Open Access Journals (Sweden)

    Balázs Szalkai

    Full Text Available Next generation sequencing technologies led to the discovery of numerous new microbe species in diverse environmental samples. Some of the new species contain genes never encountered before. Some of these genes encode proteins with novel functions, and some of these genes encode proteins that perform some well-known function in a novel way. A tool, named the Metagenomic Telescope, is described here that applies artificial intelligence methods, and seems to be capable of identifying new protein functions even in the well-studied model organisms. As a proof-of-principle demonstration of the Metagenomic Telescope, we considered DNA repair enzymes in the present work. First we identified proteins in DNA repair in well-known organisms (i.e., proteins in base excision repair, nucleotide excision repair, mismatch repair and DNA break repair); next we applied multiple alignments and then built hidden Markov profiles for each protein separately, across well-researched organisms; next, using public depositories of metagenomes, originating from extreme environments, we identified DNA repair genes in the samples. While the phylogenetic classification of the metagenomic samples is not typically available, we hypothesized that some very special DNA repair strategies need to be applied in bacteria and Archaea living in those extreme circumstances. It is a difficult task to evaluate the results obtained from mostly unknown species; therefore we again applied hidden Markov profiling: for the identified DNA repair genes in the extreme metagenomes, we prepared new hidden Markov profiles (for each gene separately, subsequent to a cluster analysis); and we searched for similarities to those profiles in model organisms. We have found well known DNA repair proteins, numerous proteins with unknown functions, and also proteins with known, but different functions in the model organisms.

  17. Metagenomic Detection Methods in Biopreparedness Outbreak Scenarios

    DEFF Research Database (Denmark)

    Karlsson, Oskar Erik; Hansen, Trine; Knutsson, Rickard

    2013-01-01

    of a clinical sample, creating a metagenome, in a single week of laboratory work. As new technologies emerge, their dissemination and capacity building must be facilitated, and criteria for use, as well as guidelines on how to report results, must be established. This article focuses on the use of metagenomics, gaps in research, and future directions. Examples of metagenomic detection, as well as possible applications of the methods, are described in various biopreparedness outbreak scenarios.

  18. Assembling large, complex environmental metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    Howe, A. C. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Plant Soil and Microbial Sciences; Jansson, J. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Earth Sciences Division; Malfatti, S. A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Tringe, S. G. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Tiedje, J. M. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Plant Soil and Microbial Sciences; Brown, C. T. [Michigan State Univ., East Lansing, MI (United States). Microbiology and Molecular Genetics, Computer Science and Engineering

    2012-12-28

    The large volumes of sequencing data required to sample complex environments deeply pose new challenges to sequence analysis approaches. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires significant computational resources. We apply two pre-assembly filtering approaches, digital normalization and partitioning, to make large metagenome assemblies more computationally tractable. Using a human gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes from matched Iowa corn and native prairie soils. The predicted functional content and phylogenetic origin of the assembled contigs indicate significant taxonomic differences despite similar function. The assembly strategies presented are generic and can be extended to any metagenome; full source code is freely available under a BSD license.
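
    The digital normalization step named in this abstract can be illustrated with a minimal sketch. It is not the authors' released code: it assumes the commonly described form of the technique, in which a read is kept only while the median abundance of its k-mers among previously kept reads is below a cutoff. The function names, the exact dictionary counter (large data sets would need a memory-efficient approximate counter) and the toy values of `k` and `cutoff` are all illustrative assumptions.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a read, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k=4, cutoff=2):
    """Keep a read only if its median k-mer abundance so far is below cutoff.

    Hypothetical helper for illustration: k-mer counts are updated only
    from reads that are kept, so redundant high-coverage reads are dropped
    while reads from rare sequences pass through.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to estimate
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Five copies of an abundant read followed by one rare read.
reads = ["ACGTTGCAAT"] * 5 + ["GGATCCTAGA"]
normalized = digital_normalization(reads, k=4, cutoff=2)
```

    In this toy input the abundant read is retained only until its estimated coverage reaches the cutoff, while the rare read is always retained, which is the property that shrinks the input to an assembler without discarding low-coverage sequence.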

  19. Beyond biodiversity: fish metagenomes.

    Directory of Open Access Journals (Sweden)

    Alba Ardura

    Full Text Available Biodiversity and intra-specific genetic diversity are interrelated and determine the potential of a community to survive and evolve. Both are considered together in Prokaryote communities treated as metagenomes or ensembles of functional variants beyond species limits. Many factors alter biodiversity in higher Eukaryote communities, and human exploitation can be one of the most important for some groups of plants and animals. For example, fisheries can modify both biodiversity and genetic diversity (intra-specific). Intra-specific diversity can be drastically altered by overfishing. Intense fishing pressure on one stock may imply extinction of some genetic variants and subsequent loss of intra-specific diversity. The objective of this study was to apply a metagenome approach to fish communities and explore its value for rapid evaluation of biodiversity and genetic diversity at community level. Here we have applied the metagenome approach employing the barcoding target gene coi as a model sequence in catch from four very different fish assemblages exploited by fisheries: freshwater communities from the Amazon River and northern Spanish rivers, and marine communities from the Cantabric and Mediterranean seas. Treating all sequences obtained from each regional catch as a biological unit (exploited community), we found that metagenomic diversity indices of the Amazonian catch sample here examined were lower than expected. Reduced diversity could be explained, at least partially, by overexploitation of the fish community that had been independently estimated by other methods. We propose using a metagenome approach for estimating diversity in Eukaryote communities and for early evaluation of genetic variation losses at multi-species level.
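
    As an illustration of treating all sequences from one catch as a single unit, the sketch below computes one diversity index over a pooled list of sequences. The abstract does not say which indices were used; the Shannon index is assumed here purely as a familiar example, and the function name and toy sequences are hypothetical.

```python
from collections import Counter
from math import log


def shannon_index(sequences):
    """Shannon diversity H = -sum(p_i * ln p_i) over distinct sequence variants.

    Each distinct sequence string is treated as one variant, regardless of
    which species it came from (an assumption made for this sketch).
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    return -sum((n / total) * log(n / total) for n in counts.values())


# Toy pooled catches: one dominated by a single variant, one evenly mixed.
low_diversity_catch = ["ACGT", "ACGT", "ACGT", "ACGT"]
even_catch = ["ACGT", "ACGA", "ACGC", "ACGG"]

h_low = shannon_index(low_diversity_catch)
h_even = shannon_index(even_catch)
```

    A catch dominated by a single variant scores zero, whereas an evenly mixed catch of four variants scores ln 4, so lower values indicate fewer or more unevenly represented variants in the pooled sample.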

  20. Marine metagenomics as a source for bioprospecting

    KAUST Repository

    Kodzius, Rimantas

    2015-08-12

    This review summarizes usage of genome-editing technologies for metagenomic studies; these studies are used to retrieve and modify valuable microorganisms for production, particularly in marine metagenomics. Organisms may be cultivable or uncultivable. Metagenomics is providing especially valuable information for uncultivable samples. Novel genes, pathways and genomes can be deduced. Therefore, metagenomics, particularly genome engineering and systems biology, allows for the enhancement of biological and chemical producers and the creation of novel bioresources. With natural resources rapidly depleting, genomics may be an effective way to efficiently produce quantities of known and novel foods, livestock feed, fuels, pharmaceuticals and fine or bulk chemicals.

  1. Marine metagenomics as a source for bioprospecting.

    Science.gov (United States)

    Kodzius, Rimantas; Gojobori, Takashi

    2015-12-01

    This review summarizes usage of genome-editing technologies for metagenomic studies; these studies are used to retrieve and modify valuable microorganisms for production, particularly in marine metagenomics. Organisms may be cultivable or uncultivable. Metagenomics is providing especially valuable information for uncultivable samples. Novel genes, pathways and genomes can be deduced. Therefore, metagenomics, particularly genome engineering and systems biology, allows for the enhancement of biological and chemical producers and the creation of novel bioresources. With natural resources rapidly depleting, genomics may be an effective way to efficiently produce quantities of known and novel foods, livestock feed, fuels, pharmaceuticals and fine or bulk chemicals. Copyright © 2015. Published by Elsevier B.V.

  2. The metagenomic data life-cycle: standards and best practices

    Energy Technology Data Exchange (ETDEWEB)

    ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo; Corre, Erwan; Fosso, Bruno; Meyer, Folker; Mitchell, Alex; Pelletier, Eric; Pesole, Graziano; Santamaria, Monica; Willassen, Nils Peder; Cochrane, Guy

    2017-06-16

    Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonised way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (1) material sampling, (2) material sequencing, (3) data analysis and (4) data archiving & publishing. Taking examples from marine research, we summarise essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community but greater awareness and adoption are still needed. We emphasise the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.

  3. Metagenomics and CAZyme Discovery.

    Science.gov (United States)

    Kunath, Benoit J; Bremges, Andreas; Weimann, Aaron; McHardy, Alice C; Pope, Phillip B

    2017-01-01

    Microorganisms play a primary role in regulating biogeochemical cycles and are a valuable source of enzymes that have biotechnological applications, such as carbohydrate-active enzymes (CAZymes). However, the inability to culture the majority of microorganisms that exist in natural ecosystems using common culture-dependent techniques restricts access to potentially novel cellulolytic bacteria and beneficial enzymes. The development of molecular-based culture-independent methods such as metagenomics enables researchers to study microbial communities directly from environmental samples, and presents a platform from which enzymes of interest can be sourced. We outline key methodological stages that are required as well as describe specific protocols that are currently used for metagenomic projects dedicated to CAZyme discovery.

  4. Metagenomics using next-generation sequencing.

    Science.gov (United States)

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis.

  5. Metagenome Analysis: a Powerful Tool for Enzyme Bioprospecting.

    Science.gov (United States)

    Madhavan, Aravind; Sindhu, Raveendran; Parameswaran, Binod; Sukumaran, Rajeev K; Pandey, Ashok

    2017-10-01

    Microorganisms are found in every corner of nature, and a vast number of them are difficult to cultivate by classical microbiological techniques. The advent of metagenomics has revolutionized the field of microbial biotechnology. Metagenomics allows the recovery of genetic material directly from environmental niches without any cultivation techniques. Currently, metagenomic tools are widely employed as powerful tools to isolate and identify enzymes with novel biocatalytic activities from the uncultivable component of microbial communities. The employment of next-generation sequencing techniques for metagenomics has resulted in the generation of large sequence data sets derived from various environments, such as soil, the human body and ocean water. This review article describes the state-of-the-art techniques and tools in metagenomics and discusses the potential of metagenomic approaches for the bioprospecting of industrial enzymes from various environmental samples. We also describe the unusual novel enzymes discovered via metagenomic approaches and discuss the future prospects for metagenome technologies.

  6. Metagenomic analysis of medicinal Cannabis samples; pathogenic bacteria, toxigenic fungi, and beneficial microbes grow in culture-based yeast and mold tests.

    Science.gov (United States)

    McKernan, Kevin; Spangler, Jessica; Helbert, Yvonne; Lynch, Ryan C; Devitt-Lee, Adrian; Zhang, Lei; Orphe, Wendell; Warner, Jason; Foss, Theodore; Hudalla, Christopher J; Silva, Matthew; Smith, Douglas R

    2016-01-01

    Background: The presence of bacteria and fungi in medicinal or recreational Cannabis poses a potential threat to consumers if those microbes include pathogenic or toxigenic species. This study evaluated two widely used culture-based platforms for total yeast and mold (TYM) testing marketed by 3M Corporation and Biomérieux, in comparison with a quantitative PCR (qPCR) approach marketed by Medicinal Genomics Corporation. Methods: A set of 15 medicinal Cannabis samples was analyzed using 3M and Biomérieux culture-based platforms and by qPCR to quantify microbial DNA. All samples were then subjected to next-generation sequencing and metagenomics analysis to enumerate the bacteria and fungi present before and after growth on culture-based media. Results: Several pathogenic or toxigenic bacterial and fungal species were identified in proportions of >5% of classified reads on the samples, including Acinetobacter baumannii, Escherichia coli, Pseudomonas aeruginosa, Ralstonia pickettii, Salmonella enterica, Stenotrophomonas maltophilia, Aspergillus ostianus, Aspergillus sydowii, Penicillium citrinum and Penicillium steckii. Samples subjected to culture showed substantial shifts in the number and diversity of species present, including the failure of Aspergillus species to grow well on either platform. Substantial growth of Clostridium botulinum and other bacteria was frequently observed on one or both of the culture-based TYM platforms. The presence of plant growth promoting (beneficial) fungal species further influenced the differential growth of species in the microbiome of each sample. Conclusions: These findings have important implications for the Cannabis and food safety testing industries.

  7. Metazen - metadata capture for metagenomes.

    Science.gov (United States)

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-01-01

    As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
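
    The combination of mandatory and optional fields described above can be illustrated with a small validation routine. This is a sketch only and does not reproduce Metazen's actual templates or interface: the field names in `MANDATORY_FIELDS` and the function name are hypothetical.

```python
# Hypothetical mandatory fields; Metazen's real templates are not reproduced here.
MANDATORY_FIELDS = (
    "project_name",
    "sample_name",
    "collection_date",
    "latitude",
    "longitude",
)


def missing_fields(record, mandatory=MANDATORY_FIELDS):
    """Return the mandatory fields that are absent or empty in a metadata record.

    Extra (optional) fields in the record are allowed and ignored.
    """
    return [field for field in mandatory if record.get(field) in (None, "")]


# A record with optional extras but two mandatory fields not filled in.
record = {
    "project_name": "Example survey",
    "sample_name": "station_01",
    "collection_date": "",
    "latitude": 0.0,  # a legitimate value of zero must not count as missing
    "depth_m": 5,     # optional, user-added field
}
problems = missing_fields(record)
```

    Checking for `None` or an empty string, rather than general falsiness, keeps valid zero values such as an equatorial latitude from being reported as missing.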

  8. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes.

    Science.gov (United States)

    Hingamp, Pascal; Grimsley, Nigel; Acinas, Silvia G; Clerissi, Camille; Subirana, Lucie; Poulain, Julie; Ferrera, Isabel; Sarmento, Hugo; Villar, Emilie; Lima-Mendez, Gipsi; Faust, Karoline; Sunagawa, Shinichi; Claverie, Jean-Michel; Moreau, Hervé; Desdevises, Yves; Bork, Peer; Raes, Jeroen; de Vargas, Colomban; Karsenti, Eric; Kandels-Lewis, Stefanie; Jaillon, Olivier; Not, Fabrice; Pesant, Stéphane; Wincker, Patrick; Ogata, Hiroyuki

    2013-09-01

    Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2-1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10^4-10^5 genomes ml^-1 for the samples from the photic zone and 10^2-10^3 genomes ml^-1 for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts.

  9. Exploring neighborhoods in the metagenome universe.

    Science.gov (United States)

    Aßhauer, Kathrin P; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-07-14

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited to the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat on average nine out of ten nearest neighbors represent the same habitat category, independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D-visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.
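
    A profile-based k-nearest-neighbor search of the kind described can be sketched as follows. The abstract does not name the distance measures that were compared, so cosine distance is used here purely as an example of a vector-based measure; the function names, profiles and sample names are hypothetical and this is not the CoMet-Universe implementation.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two equally long feature profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero profile as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(query, database, k=10):
    """Return the names of the k database entries closest to the query profile.

    `database` is a list of (name, profile) pairs.
    """
    ranked = sorted(database, key=lambda item: cosine_distance(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical three-feature profiles (e.g. counts of functional categories).
database = [
    ("soil_1", [9, 1, 0]),
    ("soil_2", [8, 2, 0]),
    ("ocean_1", [0, 1, 9]),
]
neighbors = nearest_neighbors([10, 1, 0], database, k=2)
```

    The fraction of returned neighbors that share the query's habitat label gives the neighborhood accuracy the abstract reports.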

  10. Sequencing at sea: challenges and experiences in Ion Torrent PGM sequencing during the 2013 Southern Line Islands Research Expedition

    Directory of Open Access Journals (Sweden)

    Yan Wei Lim

    2014-08-01

    Full Text Available Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned to the laboratory. During the 2013 Southern Line Islands Research Expedition, we started the first effort to bring next generation sequencing to some of the most remote locations on our planet. We successfully sequenced twenty-six marine microbial genomes and two marine microbial metagenomes using the Ion Torrent PGM platform on the Merchant Yacht Hanse Explorer. Onboard sequence assembly, annotation, and analysis enabled us to investigate the role of the microbes in the coral reef ecology of these islands and atolls. This analysis identified phosphonate as an important phosphorus source for microbes growing in the Line Islands and reinforced the importance of L-serine in marine microbial ecosystems. Sequencing in the field allowed us to propose hypotheses and conduct experiments and further sampling based on the sequences generated. By eliminating the delay between sampling and sequencing, we enhanced the productivity of the research expedition. By overcoming the hurdles associated with sequencing on a boat in the middle of the Pacific Ocean, we proved the flexibility of the sequencing, annotation, and analysis pipelines.

  11. Sequencing at sea: challenges and experiences in Ion Torrent PGM sequencing during the 2013 Southern Line Islands Research Expedition.

    Science.gov (United States)

    Lim, Yan Wei; Cuevas, Daniel A; Silva, Genivaldo Gueiros Z; Aguinaldo, Kristen; Dinsdale, Elizabeth A; Haas, Andreas F; Hatay, Mark; Sanchez, Savannah E; Wegley-Kelly, Linda; Dutilh, Bas E; Harkins, Timothy T; Lee, Clarence C; Tom, Warren; Sandin, Stuart A; Smith, Jennifer E; Zgliczynski, Brian; Vermeij, Mark J A; Rohwer, Forest; Edwards, Robert A

    2014-01-01

    Genomics and metagenomics have revolutionized our understanding of marine microbial ecology and the importance of microbes in global geochemical cycles. However, the process of DNA sequencing has always been an abstract extension of the research expedition, completed once the samples were returned to the laboratory. During the 2013 Southern Line Islands Research Expedition, we started the first effort to bring next generation sequencing to some of the most remote locations on our planet. We successfully sequenced twenty-six marine microbial genomes and two marine microbial metagenomes using the Ion Torrent PGM platform on the Merchant Yacht Hanse Explorer. Onboard sequence assembly, annotation, and analysis enabled us to investigate the role of the microbes in the coral reef ecology of these islands and atolls. This analysis identified phosphonate as an important phosphorus source for microbes growing in the Line Islands and reinforced the importance of L-serine in marine microbial ecosystems. Sequencing in the field allowed us to propose hypotheses and conduct experiments and further sampling based on the sequences generated. By eliminating the delay between sampling and sequencing, we enhanced the productivity of the research expedition. By overcoming the hurdles associated with sequencing on a boat in the middle of the Pacific Ocean, we proved the flexibility of the sequencing, annotation, and analysis pipelines.

  12. [Expedition medicine].

    Science.gov (United States)

    Donlagić, Lana

    2009-01-01

    Expedition and wilderness medicine is a term that combines rescue medicine and sports medicine, as well as more specific branches such as polar or high-altitude medicine. It is being intensively studied both at research institutes and on expeditions. Ophthalmologists are concentrating on research into HARH (High Altitude Retinal Hemorrhage), neurologists on HACE (High Altitude Cerebral Edema) research, psychologists are developing tests to describe cognitive functions, and many physicians are being trained to work in extreme environments. The results of all this effort are numerous new findings in the pathophysiology and therapy of altitude illness, increased safety on expeditions and further development of expeditionism.

  13. Swine Fecal Metagenomics

    Science.gov (United States)

    Metagenomic approaches are providing rapid and more robust means to investigate the composition and functional genetic potential of complex microbial communities. In this study, we utilized a metagenomic approach to further understand the functional diversity of the swine gut. To...

  14. Integrated Metagenomic and Metatranscriptomic Analyses of Microbial Communities in the Meso- and Bathypelagic Realm of North Pacific Ocean

    Directory of Open Access Journals (Sweden)

    Deirdre R. Meldrum

    2013-10-01

    Full Text Available Although emerging evidence indicates that deep-sea water contains an untapped reservoir of high metabolic and genetic diversity, this realm has not been studied well compared with surface sea water. The study provided the first integrated meta-genomic and -transcriptomic analysis of the microbial communities in deep-sea water of the North Pacific Ocean. DNA/RNA amplifications and simultaneous metagenomic and metatranscriptomic analyses were employed to discover information concerning deep-sea microbial communities from four different deep-sea sites ranging from the mesopelagic to pelagic ocean. Within the prokaryotic community, bacteria are absolutely dominant (~90%) over archaea in both metagenomic and metatranscriptomic data pools. The emergence of archaeal phyla Crenarchaeota, Euryarchaeota, Thaumarchaeota, bacterial phyla Actinobacteria, Firmicutes, sub-phyla Betaproteobacteria, Deltaproteobacteria, and Gammaproteobacteria, and the decrease of bacterial phyla Bacteroidetes and Alphaproteobacteria are the main composition changes of prokaryotic communities in the deep-sea water, when compared with the reference Global Ocean Sampling Expedition (GOS) surface water. Photosynthetic Cyanobacteria exist in all four metagenomic libraries and two metatranscriptomic libraries. In the Eukaryota community, decreased abundance of fungi and algae in the deep sea was observed. The RNA/DNA ratio was employed as an index to show the metabolic activity strength of microbes in the deep sea. Functional analysis indicated that deep-sea microbes are leading a defensive lifestyle.
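
    The abstract does not spell out how the RNA/DNA ratio was computed. One plausible reading, assumed here, is the ratio of a taxon's relative abundance in the metatranscriptome to its relative abundance in the metagenome, so that values above 1 suggest above-average activity. The function name, taxon labels and read counts below are hypothetical.

```python
def rna_dna_ratio(rna_counts, dna_counts):
    """Per-taxon ratio of relative abundance in RNA reads to that in DNA reads.

    Assumption: counts are classified read counts per taxon; taxa with no DNA
    reads are skipped because the ratio is undefined for them.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    ratios = {}
    if rna_total == 0 or dna_total == 0:
        return ratios
    for taxon, dna in dna_counts.items():
        if dna == 0:
            continue
        rna_fraction = rna_counts.get(taxon, 0) / rna_total
        dna_fraction = dna / dna_total
        ratios[taxon] = rna_fraction / dna_fraction
    return ratios


# Hypothetical read counts for two taxa at one site.
rna = {"taxon_A": 30, "taxon_B": 70}
dna = {"taxon_A": 60, "taxon_B": 40}
ratios = rna_dna_ratio(rna, dna)
```

    Normalizing each library by its own total keeps differences in sequencing depth between the DNA and RNA libraries from distorting the index.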

  15. Expedition sol

    DEFF Research Database (Denmark)

    Jacobsen, Aase Roland

    2006-01-01

    Join Expedition Sun around the museum. Has someone taken a bite out of the Sun? Why do solar eclipses happen? Could we do without the Sun?

  16. Parallel-META 2.0: Enhanced Metagenomic Data Analysis with Functional Annotation, High Performance Computing and Advanced Visualization

    OpenAIRE

    Su, Xiaoquan; Pan, Weihua; Song, Baoxing; Xu, Jian; Ning, Kang

    2014-01-01

    The metagenomic method directly sequences and analyses genome information from microbial communities. The main computational tasks for metagenomic analyses include taxonomical and functional structure analysis for all genomes in a microbial community (also referred to as a metagenomic sample). With the advancement of Next Generation Sequencing (NGS) techniques, the number of metagenomic samples and the data size for each sample are increasing rapidly. Current metagenomic analysis is both data...

  17. Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

    Science.gov (United States)

    Olson, Nathan D; Treangen, Todd J; Hill, Christopher M; Cepeda-Espinoza, Victoria; Ghurye, Jay; Koren, Sergey; Pop, Mihai

    2017-08-07

    Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.

  18. Diagnostic metagenomics: potential applications to bacterial, viral and parasitic infections.

    Science.gov (United States)

    Pallen, M J

    2014-12-01

    The term 'shotgun metagenomics' is applied to the direct sequencing of DNA extracted from a sample without culture or target-specific amplification or capture. In diagnostic metagenomics, this approach is applied to clinical samples in the hope of detecting and characterizing pathogens. Here, I provide a conceptual overview, before reviewing several recent promising proof-of-principle applications of metagenomics in virus discovery, analysis of outbreaks and detection of pathogens in contemporary and historical samples. I also evaluate future prospects for diagnostic metagenomics in the light of relentless improvements in sequencing technologies.

  19. Metagenomics and future perspectives in virus discovery.

    Science.gov (United States)

    Mokili, John L; Rohwer, Forest; Dutilh, Bas E

    2012-02-01

    Monitoring the emergence and re-emergence of viral diseases with the goal of containing the spread of viral agents requires both adequate preparedness and quick response. Identifying the causative agent of a new epidemic is one of the most important steps for effective response to disease outbreaks. Traditionally, virus discovery required propagation of the virus in cell culture, a proven technique responsible for the identification of the vast majority of viruses known to date. However, many viruses cannot be easily propagated in cell culture, thus limiting our knowledge of viruses. Viral metagenomic analyses of environmental samples suggest that the field of virology has explored less than 1% of the extant viral diversity. In the last decade, the culture-independent and sequence-independent metagenomic approach has permitted the discovery of many viruses in a wide range of samples. Phylogenetically, some of these viruses are distantly related to previously discovered viruses. In addition, 60-99% of the sequences generated in different viral metagenomic studies are not homologous to known viruses. In this review, we discuss the advances in the area of viral metagenomics during the last decade and their relevance to virus discovery, clinical microbiology and public health. We discuss the potential of metagenomics for characterization of the normal viral population in a healthy community and identification of viruses that could pose a threat to humans through zoonosis. In addition, we propose a new model of Koch's postulates named the 'Metagenomic Koch's Postulates'. Unlike the original Koch's postulates and the Molecular Koch's postulates as formulated by Falkow, the metagenomic Koch's postulates focus on the identification of metagenomic traits in disease cases, that is, traits that can be traced after healthy individuals have been exposed to the source of the suspected pathogen. Copyright © 2011 Elsevier B.V. All rights reserved.

  20. Bioprospecting metagenomes: Glycosyl hydrolases for converting biomass

    Energy Technology Data Exchange (ETDEWEB)

    Li, L.; van der Lelie, D.; McCorkle, S. R.; Monchy, S.; Taghavi, S.

    2009-05-18

    Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies.

  1. Bioprospecting metagenomes: glycosyl hydrolases for converting biomass

    Directory of Open Access Journals (Sweden)

    Monchy Sebastien

    2009-05-01

    Full Text Available Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies.

  2. [Pathology and viral metagenomics, a recent history].

    Science.gov (United States)

    Bernardo, Pauline; Albina, Emmanuel; Eloit, Marc; Roumagnac, Philippe

    2013-05-01

    The study of human, animal and plant viral diseases has greatly benefited from recent developments in metagenomics. Viral metagenomics is a culture-independent approach used to investigate the complete viral genetic populations of a sample. During the last decade, metagenomics concepts and techniques that were first used by ecologists progressively spread into the scientific field of viral pathology. The sample, which for ecologists was first a fraction of an ecosystem, became for pathologists an organism that hosts millions of microbes and viruses. This new approach, which provides high-resolution qualitative and quantitative data on viral diversity without a priori assumptions, is now revolutionizing the way pathologists decipher viral diseases. This review describes the latest improvements in high-throughput next-generation sequencing methods and discusses the applications of viral metagenomics in viral pathology, including discovery of novel viruses, viral surveillance and diagnostics, large-scale molecular epidemiology, and viral evolution. © 2013 médecine/sciences – Inserm.

  3. Metagenomic exploration of viruses throughout the Indian Ocean.

    Directory of Open Access Journals (Sweden)

    Shannon J Williamson

    Full Text Available The characterization of global marine microbial taxonomic and functional diversity is a primary goal of the Global Ocean Sampling Expedition. As part of this study, 19 water samples were collected aboard the Sorcerer II sailing vessel from the southern Indian Ocean in an effort to more thoroughly understand the lifestyle strategies of the microbial inhabitants of this ultra-oligotrophic region. No investigations of whole virioplankton assemblages have been conducted on waters collected from the Indian Ocean or across multiple size fractions thus far. Therefore, the goals of this study were to examine the effect of size fractionation on viral consortia structure and function and understand the diversity and functional potential of the Indian Ocean virome. Five samples were selected for comprehensive metagenomic exploration, and sequencing was performed on the microbes captured on 3.0-, 0.8- and 0.1-µm membrane filters as well as the viral fraction (<0.1 µm). Phylogenetic approaches were also used to identify predicted proteins of viral origin in the larger fractions of data from all Indian Ocean samples, which were included in subsequent metagenomic analyses. Taxonomic profiling of viral sequences suggested that size fractionation of marine microbial communities enriches for specific groups of viruses within the different size classes and functional characterization further substantiated this observation. Functional analyses also revealed a relative enrichment for metabolic proteins of viral origin that potentially reflect the physiological condition of host cells in the Indian Ocean including those involved in nitrogen metabolism and oxidative phosphorylation. A novel classification method, MGTAXA, was used to assess virus-host relationships in the Indian Ocean by predicting the taxonomy of putative host genera, with Prochlorococcus, Acanthochlois and members of the SAR86 cluster comprising the most abundant predictions. This is the first study

  4. Human milk metagenome: a functional capacity analysis

    Science.gov (United States)

    2013-01-01

    Background Human milk contains a diverse population of bacteria that likely influences colonization of the infant gastrointestinal tract. Recent studies, however, have been limited to characterization of this microbial community by 16S rRNA analysis. In the present study, a metagenomic approach using Illumina sequencing of a pooled milk sample (ten donors) was employed to determine the genera of bacteria and the types of bacterial open reading frames in human milk that may influence bacterial establishment and stability in this primal food matrix. The human milk metagenome was also compared to that of breast-fed and formula-fed infants’ feces (n = 5, each) and mothers’ feces (n = 3) at the phylum level and at a functional level using open reading frame abundance. Additionally, immune-modulatory bacterial-DNA motifs were also searched for within human milk. Results The bacterial community in human milk contained over 360 prokaryotic genera, with sequences aligning predominantly to the phyla of Proteobacteria (65%) and Firmicutes (34%), and the genera of Pseudomonas (61.1%), Staphylococcus (33.4%) and Streptococcus (0.5%). From assembled human milk-derived contigs, 30,128 open reading frames were annotated and assigned to functional categories. When compared to the metagenome of infants’ and mothers’ feces, the human milk metagenome was less diverse at the phylum level, and contained more open reading frames associated with nitrogen metabolism, membrane transport and stress response (P milk metagenome also contained a similar occurrence of immune-modulatory DNA motifs to that of infants’ and mothers’ fecal metagenomes. Conclusions Our results further expand the complexity of the human milk metagenome and enforce the benefits of human milk ingestion on the microbial colonization of the infant gut and immunity. Discovery of immune-modulatory motifs in the metagenome of human milk indicates more exhaustive analyses of the functionality of the human

  5. Metagenomic analysis of a sample from a patient with respiratory tract infection reveals the presence of a γ-papillomavirus

    Directory of Open Access Journals (Sweden)

    Marta Canuti

    2014-07-01

    Full Text Available Previously unknown or unexpected pathogens may be responsible for that proportion of respiratory diseases in which a causative agent cannot be identified. The application of broad-spectrum, sequence-independent virus discovery techniques may be useful to reduce this proportion and widen our knowledge about respiratory pathogens. Thanks to the availability of high throughput sequencing (HTS) technology, it has become possible to detect viruses that are present at a very low load, but the clinical relevance of those viruses must be investigated. In this study we used VIDISCA-454, a restriction enzyme based virus discovery method that utilizes the Roche 454 HTS system, on a nasal swab collected from a subject with respiratory complaints. A γ-papillomavirus was detected (complete genome: 7142 bp) and its role in disease was investigated. Respiratory samples collected both during the acute phase of the illness and two weeks after full recovery contained the virus. The patient presented antibodies directed against the virus but there was no difference between IgG levels in blood samples collected during the acute phase and two weeks after full recovery. We therefore concluded that the detected γ-papillomavirus is unlikely to be the causative agent of the respiratory complaints and its presence in the nose of the patient is not related to the disease. Although HTS based virus discovery techniques proved their great potential as a tool to clarify the aetiology of some infectious diseases, the obtained information must be interpreted cautiously. This study underlines the crucial importance of performing careful investigations on viruses identified when applying sensitive virus discovery techniques, since the mere identification of a virus and its presence in a clinical sample are not sufficient proof to establish a causative link with a disease.

  6. An algorithm for detecting eukaryotic sequences in metagenomic ...

    Indian Academy of Sciences (India)

    Physical partitioning techniques are routinely employed (during sample preparation stage) for segregating the prokaryotic and eukaryotic fractions of metagenomic samples. In spite of these efforts, several metagenomic studies focusing on bacterial and archaeal populations have reported the presence of contaminating ...

  7. Toward Accurate and Quantitative Comparative Metagenomics.

    Science.gov (United States)

    Nayfach, Stephen; Pollard, Katherine S

    2016-08-25

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Comparative metagenomics of the Red Sea

    KAUST Repository

    Mineta, Katsuhiko

    2016-01-26

    Metagenomics produces a tremendous amount of data that comes from the organisms living in the environment. This big data enables us to examine not only microbial genes but also the community structure, interaction and adaptation mechanisms at a specific location and condition. The Red Sea has several unique characteristics such as high salinity, high temperature and low nutrient levels. These features must have contributed to forming a unique microbial community during the evolutionary process. In 2014, we started monthly samplings of the metagenomes in the Red Sea under the KAUST-CCF project. In collaboration with Kitasato University, we also collected metagenome data from the ocean in Japan, which shows contrasting features to the Red Sea. Therefore, the comparative metagenomics of those data provides a comprehensive view of the Red Sea microbes, leading to the identification of key microbes, genes and networks related to those environmental differences.

  9. Quality control of microbiota metagenomics by k-mer analysis.

    Science.gov (United States)

    Plaza Onate, Florian; Batto, Jean-Michel; Juste, Catherine; Fadlallah, Jehane; Fougeroux, Cyrielle; Gouas, Doriane; Pons, Nicolas; Kennedy, Sean; Levenez, Florence; Dore, Joel; Ehrlich, S Dusko; Gorochov, Guy; Larsen, Martin

    2015-03-14

    The biological and clinical consequences of the tight interactions between host and microbiota are rapidly being unraveled by next generation sequencing technologies and sophisticated bioinformatics, also referred to as microbiota metagenomics. The recent success of metagenomics has created a demand to rapidly apply the technology to large case-control cohort studies and to studies of microbiota from various habitats, including habitats relatively poor in microbes. It is therefore of foremost importance to enable a robust and rapid quality assessment of metagenomic data from samples that challenge present technological limits (sample numbers and size). Here we demonstrate that the distribution of overlapping k-mers of metagenome sequence data predicts sequence quality as defined by gene distribution and efficiency of sequence mapping to a reference gene catalogue. We used serial dilutions of gut microbiota metagenomic datasets to generate well-defined high to low quality metagenomes. We also analyzed a collection of 52 microbiota-derived metagenomes. We demonstrate that k-mer distributions of metagenomic sequence data identify sequence contaminations, such as sequences derived from "empty" ligation products. Of note, k-mer distributions were also able to predict the frequency of sequences mapping to a reference gene catalogue not only for the well-defined serial dilution datasets, but also for 52 human gut microbiota derived metagenomic datasets. We propose that k-mer analysis of raw metagenome sequence reads should be implemented as a first quality assessment prior to more extensive bioinformatics analysis, such as sequence filtering and gene mapping. With the rising demand for metagenomic analysis of microbiota it is crucial to provide tools for rapid and efficient decision making. This will eventually lead to a faster turn-around time, improved analytical quality including sample quality metrics and a significant cost reduction. Finally, improved quality
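
    The k-mer quality check described in this record can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of k, and the idea of flagging a dataset by the share of its most frequent k-mer are assumptions made here purely for illustration.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def abundance_histogram(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def top_kmer_fraction(counts, top=1):
    """Share of all k-mer occurrences taken by the `top` most frequent k-mers.

    A very high share is used here as a hypothetical warning sign of
    over-represented sequence (for example, contamination).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(top)) / total


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTTTTT"]
    counts = kmer_counts(reads, k=4)
    print(dict(abundance_histogram(counts)))
    print(top_kmer_fraction(counts))
```

    In practice the histogram would be computed on raw reads before filtering or gene mapping, which is the ordering the abstract proposes.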

  10. An Experimental Metagenome Data Management and Analysis System

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Korzeniewski, Frank; Palaniappan, Krishna; Szeto, Ernest; Ivanova, Natalia N.; Kyrpides, Nikos C.; Hugenholtz, Philip

    2006-03-01

    The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity of microbial community, needs to be conducted in the context of a comprehensive data management and analysis system. We present in this paper IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context.

  11. Metagenomics at Grass Roots

    Indian Academy of Sciences (India)

    Home; Journals; Resonance – Journal of Science Education; Volume 22; Issue 3. Metagenomics at Grass Roots. Sudeshna ... benefit human health, agriculture, and ecosystem functions. This article provides a brief history of technical advances in metagenomics, including DNA sequencing methods, and some case studies.

  12. Metagenomic Analysis of Dairy Bacteriophages: Extraction Method and Pilot Study on Whey Samples Derived from Using Undefined and Defined Mesophilic Starter Cultures.

    Science.gov (United States)

    Muhammed, Musemma K; Kot, Witold; Neve, Horst; Mahony, Jennifer; Castro-Mejía, Josué L; Krych, Lukasz; Hansen, Lars H; Nielsen, Dennis S; Sørensen, Søren J; Heller, Knut J; van Sinderen, Douwe; Vogensen, Finn K

    2017-10-01

    Despite being potentially highly useful for characterizing the biodiversity of phages, metagenomic studies are currently not available for dairy bacteriophages, partly due to the lack of a standard procedure for phage extraction. We optimized an extraction method that allows the removal of the bulk protein from whey and milk samples with losses of less than 50% of spiked phages. The protocol was applied to extract phages from whey in order to test the notion that members of Lactococcus lactis 936 (now Sk1virus), P335, c2 (now C2virus) and Leuconostoc phage groups are the most frequently encountered in the dairy environment. The relative abundance and diversity of phages in eight and four whey mixtures from dairies using undefined mesophilic mixed-strain cultures containing Lactococcus lactis subsp. lactis biovar diacetylactis and Leuconostoc species (i.e., DL starter cultures) and defined cultures, respectively, were assessed. Results obtained from transmission electron microscopy and high-throughput sequence analyses revealed the dominance of Lc. lactis 936 phages (order Caudovirales, family Siphoviridae) in dairies using undefined DL starter cultures and Lc. lactis c2 phages (order Caudovirales, family Siphoviridae) in dairies using defined cultures. The 936 and Leuconostoc phages demonstrated limited diversity. Possible coinduction of temperate P335 prophages and satellite phages in one of the whey mixtures was also observed. IMPORTANCE The method optimized in this study could provide an important basis for understanding the dynamics of the phage community (abundance, development, diversity, evolution, etc.) in dairies with different sizes, locations, and production strategies. It may also enable the discovery of previously unknown phages, which is crucial for the development of rapid molecular biology-based methods for phage burden surveillance systems. The dominance of only a few phage groups in the dairy environment signifies the depth of knowledge gained over

  13. Underway Sampling of Marine Inherent Optical Properties on the Tara Oceans Expedition as a Novel Resource for Ocean Color Satellite Data Product Validation

    Science.gov (United States)

    Werdell, P. Jeremy; Proctor, Christopher W.; Boss, Emmanuel; Leeuw, Thomas; Ouhssain, Mustapha

    2013-01-01

    Developing and validating data records from operational ocean color satellite instruments requires substantial volumes of high quality in situ data. In the absence of broad, institutionally supported field programs, organizations such as the NASA Ocean Biology Processing Group seek opportunistic datasets for use in their operational satellite calibration and validation activities. The publicly available, global biogeochemical dataset collected as part of the two and a half year Tara Oceans expedition provides one such opportunity. We showed how the inline measurements of hyperspectral absorption and attenuation coefficients collected onboard the R/V Tara can be used to evaluate near-surface estimates of chlorophyll-a, spectral particulate backscattering coefficients, particulate organic carbon, and particle size classes derived from the NASA Moderate Resolution Imaging Spectroradiometer onboard Aqua (MODISA). The predominant strength of such flow-through measurements is their sampling rate: the 375 days of measurements resulted in 165 viable MODISA-to-in situ match-ups, compared to 13 from discrete water sampling. While the need to apply bio-optical models to estimate biogeochemical quantities of interest from spectroscopy remains a weakness, we demonstrated how discrete samples can be used in combination with flow-through measurements to create data records of sufficient quality to conduct first order evaluations of satellite-derived data products. Given an emerging agency desire to rapidly evaluate new satellite missions, our results have significant implications for how calibration and validation teams for these missions will be constructed.

  14. Shotgun metagenomic data streams: surfing without fear

    Energy Technology Data Exchange (ETDEWEB)

    Berendzen, Joel R [Los Alamos National Laboratory

    2010-12-06

    Timely information about bio-threat prevalence, consequence, propagation, attribution, and mitigation is needed to support decision-making, both routinely and in a crisis. One DNA sequencer can stream 25 Gbp of information per day, but sampling strategies and analysis techniques are needed to turn raw sequencing power into actionable knowledge. Shotgun metagenomics can enable biosurveillance at the level of a single city, hospital, or airplane. Metagenomics characterizes viruses and bacteria from complex environments such as soil, air filters, or sewage. Unlike targeted-primer-based sequencing, shotgun methods are not blind to sequences that are truly novel, and they can measure absolute prevalence. Shotgun metagenomic sampling can be non-invasive, efficient, and inexpensive while being informative. We have developed analysis techniques for shotgun metagenomic sequencing that rely upon phylogenetic signature patterns. They work by indexing local sequence patterns in a manner similar to web search engines. Our methods are laptop-fast and favorable scaling properties ensure they will be sustainable as sequencing methods grow. We show examples of application to soil metagenomic samples.

  15. Metazen – metadata capture for metagenomes

    Science.gov (United States)

    2014-01-01

    Background As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508
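
    The combination of mandatory and optional fields described in this record can be sketched as a small validation routine. The field names below are hypothetical and are not taken from Metazen; the sketch only shows the general pattern of rejecting records with missing mandatory values while letting users add extra fields freely.

```python
# Hypothetical mandatory fields, chosen for illustration only.
MANDATORY_FIELDS = (
    "project_name",
    "sample_id",
    "collection_date",
    "latitude",
    "longitude",
)


def validate_record(record, mandatory=MANDATORY_FIELDS):
    """Return a list of problems; an empty list means the record passes.

    Fields outside `mandatory` are accepted without checks, which mirrors
    the idea of optional, user-defined metadata fields.
    """
    problems = []
    for field in mandatory:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing mandatory field: {field}")
    return problems


if __name__ == "__main__":
    sample = {
        "project_name": "demo survey",
        "sample_id": "S001",
        "collection_date": "2014-01-01",
        "latitude": "",          # left blank on purpose
        "longitude": 12.5,
        "filter_size_um": 0.1,   # optional extra field
    }
    print(validate_record(sample))
```

    Returning a list of problems rather than raising on the first one lets a submission form report every gap at once.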

  16. Preliminary results of three-dimensional stress orientation determined by anelastic strain recovery (ASR) measurements of core samples retrieved from IODP Expedition 343

    Science.gov (United States)

    Lin, W.; Yamamoto, Y.; Tanikawa, W.

    2013-12-01

    Integrated Ocean Drilling Program (IODP) Expedition 343, Japan Trench Fast Drilling Project (JFAST) penetrated to ~850 meters below seafloor (mbsf) in a water depth of 6890 m and passed through the plate boundary fault between the overriding North American Plate and the subducting Pacific Plate. The fault is located at ~820 mbsf and is preliminarily considered to be the source fault of the 2011 Tohoku-oki Mw 9.0 earthquake. The JFAST drilling site (C0019) was in the largest coseismic slip zone, where the fault slipped more than 50 m during the earthquake. Hole C0019E, dedicated to coring, retrieved 21 cores with a total length of 51 m from both the hanging wall and the footwall of the plate boundary fault. To determine the three-dimensional stress state after the huge earthquake, we collected four whole round core samples and measured anelastic strain recovery (ASR), also called 'relaxation', of the core samples onboard D/V Chikyu. The principal idea behind the ASR method is that stress-induced elastic strain is released first instantaneously (i.e., as time-independent elastic strain), followed by a more gradual or time-dependent recovery of anelastic strain. The ASR method takes advantage of the time-dependent strain and has been successfully applied in IODP expeditions (e.g. Byrne et al., 2009; Yamamoto et al., 2013). The four core samples used for ASR measurements were taken from C0019E-1R1 (~177 mbsf), C0019E-5R1 (~697 mbsf), C0019E-13R1 (~802 mbsf) and C0019E-19R2 (~828 mbsf). The three core samples at shallower depths were in the hanging wall of the fault, and the deepest one was in the footwall. We started ASR measurements approximately three hours after the core was 'on deck', that is, approximately six hours after the in situ stress was released, and continued the measurements for about two weeks. The anelastic strains measured in nine directions including six independent directions were extensional; all of the curves varied smoothly and similarly with

  17. Viral metagenomics: are we missing the giants?

    Science.gov (United States)

    Halary, S; Temmam, S; Raoult, D; Desnues, C

    2016-06-01

    Amoeba-infecting giant viruses are recently discovered viruses that have been isolated from diverse environments all around the world. In parallel to isolation efforts, metagenomics confirmed their worldwide distribution from a broad range of environmental and host-associated samples, including humans, depicting them as a major component of eukaryotic viruses in nature and a possible resident of the human/animal virome whose role is still unclear. Nevertheless, metagenomics data about amoeba-infecting giant viruses still remain scarce, mainly because of methodological limitations. Efforts should be pursued both at the metagenomic sample preparation level and on in silico analyses to better understand their roles in the environment and in human/animal health and disease. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Functional assignment of metagenomic data: challenges and applications

    Science.gov (United States)

    Prakash, Tulika

    2012-01-01

    Metagenomic sequencing provides a unique opportunity to explore earth’s limitless environments harboring scores of yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely ‘In a given environment, among the microbes present, what are they doing, and how are they doing it?’ Toward this goal, several large-scale metagenomic projects have recently been conducted or are currently underway. Functional analysis of metagenomic data mainly suffers from the vast amount of data generated in these projects. The sheer amount of data requires much computational time and storage space. These problems are compounded by other factors potentially affecting the functional analysis, including sample preparation, sequencing method and average genome size of the metagenomic samples. In addition, the read-lengths generated during sequencing influence sequence assembly, gene prediction and subsequently the functional analysis. The level of confidence for functional predictions increases with increasing read-length. Usually, the most reliable functional annotations for metagenomic sequences are achieved using homology-based approaches against publicly available reference sequence databases. Here, we present an overview of the current state of functional analysis of metagenomic sequence data, bottlenecks frequently encountered and possible solutions in light of currently available resources and tools. Finally, we provide some examples of applications from recent metagenomic studies which have been successfully conducted in spite of the known difficulties. PMID:22772835

  19. High throughput whole rumen metagenome profiling using untargeted massively parallel sequencing

    Directory of Open Access Journals (Sweden)

    Ross Elizabeth M

    2012-07-01

    Full Text Available Abstract Background Variation of microorganism communities in the rumen of cattle (Bos taurus) is of great interest because of possible links to economically or environmentally important traits, such as feed conversion efficiency or methane emission levels. The resolution of studies investigating this variation may be improved by utilizing untargeted massively parallel sequencing (MPS), that is, sequencing without targeted amplification of genes. The objective of this study was to develop a method which used MPS to generate “rumen metagenome profiles”, and to investigate if these profiles were repeatable among samples taken from the same cow. Given that faecal samples are much easier to obtain than rumen fluid samples, we also investigated whether rumen metagenome profiles were predictive of faecal metagenome profiles. Results Rather than focusing on individual organisms within the rumen, our method used MPS data to generate quantitative rumen microbiome profiles, regardless of taxonomic classifications. The method requires a previously assembled reference metagenome. A number of such reference metagenomes were considered, including two rumen derived metagenomes, a human faecal microflora metagenome and a reference metagenome made up of publicly available prokaryote sequences. Sequence reads from each test sample were aligned to these references. The “rumen metagenome profile” was generated from the number of the reads that aligned to each contig in the database. We used this method to test the hypothesis that rumen fluid microbial community profiles vary more between cows than within multiple samples from the same cow. Rumen fluid samples were taken from three cows, at three locations within the rumen. DNA from the samples was sequenced on the Illumina GAIIx. When the reads were aligned to a rumen metagenome reference, the rumen metagenome profiles were repeatable (P  Conclusions We have presented a simple and high throughput method of

  20. West Indian amphipod families and genera of the Wagenaar Hummelinck expeditions (Amphipoda, Crustacea) List of sampling stations 1930-1973

    NARCIS (Netherlands)

    Könemann, Stefan

    1997-01-01

    The conservation and scientific evaluation of major zoological collections is a relatively time-consuming task that crucially depends on regular financial support. In times of low funding and limited grants these activities are often cut back to a minimum. Samples and specimens are stored in

  1. Viral Metagenomics: MetaView Software

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, C; Smith, J

    2007-10-22

    The purpose of this report is to design and develop a tool for analysis of raw sequence read data from viral metagenomics experiments. The tool should compare read sequences of known viral nucleic acid sequence data and enable a user to attempt to determine, with some degree of confidence, what virus groups may be present in the sample. This project was conducted in two phases. In phase 1 we surveyed the literature and examined existing metagenomics tools to educate ourselves and to more precisely define the problem of analyzing raw read data from viral metagenomic experiments. In phase 2 we devised an approach and built a prototype code and database. This code takes viral metagenomic read data in fasta format as input and accesses all complete viral genomes from Kpath for sequence comparison. The system executes at the UNIX command line, producing output that is stored in an Oracle relational database. We provide here a description of the approach we came up with for handling un-assembled, short read data sets from viral metagenomics experiments. We include a discussion of the current MetaView code capabilities and additional functionality that we believe should be added, should additional funding be acquired to continue the work.

  2. Metagenomics for pathogen detection in public health

    Science.gov (United States)

    2013-01-01

    Traditional pathogen detection methods in public health infectious disease surveillance rely upon the identification of agents that are already known to be associated with a particular clinical syndrome. The emerging field of metagenomics has the potential to revolutionize pathogen detection in public health laboratories by allowing the simultaneous detection of all microorganisms in a clinical sample, without a priori knowledge of their identities, through the use of next-generation DNA sequencing. A single metagenomics analysis has the potential to detect rare and novel pathogens, and to uncover the role of dysbiotic microbiomes in infectious and chronic human disease. Making use of advances in sequencing platforms and bioinformatics tools, recent studies have shown that metagenomics can even determine the whole-genome sequences of pathogens, allowing inferences about antibiotic resistance, virulence, evolution and transmission to be made. We are entering an era in which more novel infectious diseases will be identified through metagenomics-based methods than through traditional laboratory methods. The impetus is now on public health laboratories to integrate metagenomics techniques into their diagnostic arsenals. PMID:24050114

  3. [A review on the bioinformatics pipelines for metagenomic research].

    Science.gov (United States)

    Ye, Dan-Dan; Fan, Meng-Meng; Guan, Qiong; Chen, Hong-Ju; Ma, Zhan-Shan

    2012-12-01

    Metagenome, a term first coined by Handelsman in 1998 as "the genomes of the total microbiota found in nature", refers to sequence data directly sampled from the environment (which may be any habitat in which microbes live, such as the guts of humans and animals, milk, soil, lakes, glaciers, and oceans). Metagenomic technologies originated from environmental microbiology studies and their wide application has been greatly facilitated by next-generation high throughput sequencing technologies. As in genomics studies, the bottleneck of metagenomic research is how to effectively and efficiently analyze the gigantic amount of metagenomic sequence data using bioinformatics pipelines to obtain meaningful biological insights. In this article, we briefly review the state-of-the-art bioinformatics software tools in metagenomic research. Due to the differences between the metagenomic data obtained from whole genome sequencing (i.e., shotgun metagenomics) and amplicon sequencing (i.e., 16S-rRNA and gene-targeted metagenomics) methods, there are significant differences between the corresponding bioinformatics tools for these data; accordingly, we review the computational pipelines separately for these two types of data.

  4. Scalable metagenomic taxonomy classification using a reference genome database.

    Science.gov (United States)

    Ames, Sasha K; Hysom, David A; Gardner, Shea N; Lloyd, G Scott; Gokhale, Maya B; Allen, Jonathan E

    2013-09-15

    Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take ... contents of the sample. Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat. Contact: allen99@llnl.gov. Supplementary data are available at Bioinformatics online.

  5. A catalog of the mouse gut metagenome

    DEFF Research Database (Denmark)

    Xiao, Liang; Feng, Qiang; Liang, Suisha

    2015-01-01

    We established a catalog of the mouse gut metagenome comprising ∼2.6 million nonredundant genes by sequencing DNA from fecal samples of 184 mice. To secure high microbiome diversity, we used mouse strains of diverse genetic backgrounds, from different providers, kept in different housing laboratories and fed either a low-fat or high-fat diet. Similar to the human gut microbiome, >99% of the cataloged genes are bacterial. We identified 541 metagenomic species and defined a core set of 26 metagenomic species found in 95% of the mice. The mouse gut microbiome is functionally similar to its human counterpart, with 95.2% of its Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologous groups in common. However, only 4.0% of the mouse gut microbial genes were shared (95% identity, 90% coverage) with those of the human gut microbiome. This catalog provides a useful reference for future studies.

  6. A catalog of the mouse gut metagenome.

    Science.gov (United States)

    Xiao, Liang; Feng, Qiang; Liang, Suisha; Sonne, Si Brask; Xia, Zhongkui; Qiu, Xinmin; Li, Xiaoping; Long, Hua; Zhang, Jianfeng; Zhang, Dongya; Liu, Chuan; Fang, Zhiwei; Chou, Joyce; Glanville, Jacob; Hao, Qin; Kotowska, Dorota; Colding, Camilla; Licht, Tine Rask; Wu, Donghai; Yu, Jun; Sung, Joseph Jao Yiu; Liang, Qiaoyi; Li, Junhua; Jia, Huijue; Lan, Zhou; Tremaroli, Valentina; Dworzynski, Piotr; Nielsen, H Bjørn; Bäckhed, Fredrik; Doré, Joël; Le Chatelier, Emmanuelle; Ehrlich, S Dusko; Lin, John C; Arumugam, Manimozhiyan; Wang, Jun; Madsen, Lise; Kristiansen, Karsten

    2015-10-01

    We established a catalog of the mouse gut metagenome comprising ∼2.6 million nonredundant genes by sequencing DNA from fecal samples of 184 mice. To secure high microbiome diversity, we used mouse strains of diverse genetic backgrounds, from different providers, kept in different housing laboratories and fed either a low-fat or high-fat diet. Similar to the human gut microbiome, >99% of the cataloged genes are bacterial. We identified 541 metagenomic species and defined a core set of 26 metagenomic species found in 95% of the mice. The mouse gut microbiome is functionally similar to its human counterpart, with 95.2% of its Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologous groups in common. However, only 4.0% of the mouse gut microbial genes were shared (95% identity, 90% coverage) with those of the human gut microbiome. This catalog provides a useful reference for future studies.

  7. Critical Assessment of Metagenome Interpretation

    DEFF Research Database (Denmark)

    Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter

    2017-01-01

    Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark...

  8. The metagenomic approach and causality in virology

    Directory of Open Access Journals (Sweden)

    Silvana Beres Castrignano

    2015-01-01

    Full Text Available The metagenomic approach has become a very important tool in the discovery of new viruses in environmental and biological samples. Here we discuss how these discoveries may help to elucidate the etiology of diseases and the criteria necessary to establish a causal association between a virus and a disease.

  9. Bracken: estimating species abundance in metagenomics data

    Directory of Open Access Journals (Sweden)

    Jennifer Lu

    2017-01-01

    Full Text Available Metagenomic experiments attempt to characterize microbial communities using high-throughput DNA sequencing. Identification of the microorganisms in a sample provides information about the genetic profile, population structure, and role of microorganisms within an environment. Until recently, most metagenomics studies focused on high-level characterization at the level of phyla, or alternatively sequenced the 16S ribosomal RNA gene that is present in bacterial species. As the cost of sequencing has fallen, though, metagenomics experiments have increasingly used unbiased shotgun sequencing to capture all the organisms in a sample. This approach requires a method for estimating abundance directly from the raw read data. Here we describe a fast, accurate new method that computes the abundance at the species level using the reads collected in a metagenomics experiment. Bracken (Bayesian Reestimation of Abundance after Classification with KrakEN) uses the taxonomic assignments made by Kraken, a very fast read-level classifier, along with information about the genomes themselves to estimate abundance at the species level, the genus level, or above. We demonstrate that Bracken can produce accurate species- and genus-level abundance estimates even when a sample contains multiple near-identical species.
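
    The idea of re-estimating species abundance from reads that a classifier could only place at a higher rank can be illustrated with a deliberately simplified sketch. It is not Bracken's estimator: the abstract says Bracken also uses information about the genomes themselves, whereas this sketch assumes that genus-level reads can simply be apportioned to species in proportion to the reads already assigned to each species. The function name, argument names and toy counts are all hypothetical.

```python
def redistribute_genus_reads(species_reads, genus_reads, species_to_genus):
    """Apportion genus-level reads to species, proportionally to species-level reads.

    Simplifying assumption (not the published method): the species-level read
    counts are a good proxy for how ambiguous genus-level reads should be split.
    """
    estimates = {species: float(n) for species, n in species_reads.items()}

    # Group species by their parent genus.
    members_by_genus = {}
    for species in species_reads:
        members_by_genus.setdefault(species_to_genus[species], []).append(species)

    for genus, ambiguous in genus_reads.items():
        members = members_by_genus.get(genus, [])
        total = sum(species_reads[s] for s in members)
        if total == 0:
            continue  # no species-level evidence to apportion against
        for s in members:
            estimates[s] += ambiguous * species_reads[s] / total
    return estimates


# Toy example with made-up counts.
species_reads = {"sp_A": 30, "sp_B": 10, "sp_C": 5}
species_to_genus = {"sp_A": "genus_1", "sp_B": "genus_1", "sp_C": "genus_2"}
genus_reads = {"genus_1": 20, "genus_2": 5}
estimates = redistribute_genus_reads(species_reads, genus_reads, species_to_genus)
```

    In this toy case the 20 reads stuck at genus_1 are split 3:1 between sp_A and sp_B, so the total number of reads is preserved while every read ends up at the species level.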

  10. Recent progresses in metagenomics

    Science.gov (United States)

    Metagenomics addresses the collective genetic structure and functional composition of a microbial community at its native habitat. This approach has emerged as a powerful tool to study the structure and function of the microbiota for the past few years and is revolutionizing studies of microbial ec...

  11. Exploiting topic modeling to boost metagenomic reads binning.

    Science.gov (United States)

    Zhang, Ruichang; Cheng, Zhanzhan; Guan, Jihong; Zhou, Shuigeng

    2015-01-01

    With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these metagenomic reads into different species or taxonomical classes is a vital step for metagenomic analysis, which is referred to as binning of metagenomic data. In this paper, we propose a new method TM-MCluster for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" with their frequencies occurring in the read. Then, we employ a probabilistic topic model -- the Latent Dirichlet Allocation (LDA) model to the reads, which generates a number of hidden "topics" such that each read can be represented by a distribution vector of the generated topics. Finally, as in the MCluster method, we apply SKWIC -- a variant of the classical K-means algorithm with automatic feature weighting mechanism to cluster these reads represented by topic distributions. Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that the exploitation of topic modeling can effectively improve the binning performance of metagenomic reads.
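
    The first step described above, representing each read as a set of k-mers with their frequencies, can be sketched as follows. This is a minimal illustration rather than the TM-MCluster code: the choice of k = 4, the skipping of windows with ambiguous bases, and the function names are assumptions made for the example. The later LDA and SKWIC steps are not shown.

```python
from collections import Counter


def kmer_counts(read, k=4):
    """Count overlapping k-mers in a read, skipping windows with non-ACGT bases."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_frequencies(read, k=4):
    """Normalise k-mer counts to relative frequencies (empty dict if no valid k-mer)."""
    counts = kmer_counts(read, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


counts = kmer_counts("ACGTACGT", k=4)
freqs = kmer_frequencies("ACGTACGT", k=4)
```

    Each read then becomes a sparse vector indexed by k-mer, which is the kind of "bag of words" input a topic model expects.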

  12. Metagenomic Sequencing of an In Vitro-Simulated Microbial Community

    Energy Technology Data Exchange (ETDEWEB)

    Morgan, Jenna L.; Darling, Aaron E.; Eisen, Jonathan A.

    2009-12-01

    Background: Microbial life dominates the earth, but many species are difficult or even impossible to study under laboratory conditions. Sequencing DNA directly from the environment, a technique commonly referred to as metagenomics, is an important tool for cataloging microbial life. This culture-independent approach involves collecting samples that include microbes in them, extracting DNA from the samples, and sequencing the DNA. A sample may contain many different microorganisms, macroorganisms, and even free-floating environmental DNA. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which the organism's DNA was observed in reads generated via DNA sequencing. Methodology/Principal Findings: We created mixtures of ten microbial species for which genome sequences are known. Each mixture contained an equal number of cells of each species. We then extracted DNA from the mixtures, sequenced the DNA, and measured the frequency with which genomic regions from each organism were observed in the sequenced DNA. We found that the observed frequency of reads mapping to each organism did not reflect the equal numbers of cells that were known to be included in each mixture. The relative organism abundances varied significantly depending on the DNA extraction and sequencing protocol utilized. Conclusions/Significance: We describe a new data resource for measuring the accuracy of metagenomic binning methods, created by in vitro-simulation of a metagenomic community. Our in vitro simulation can be used to complement previous in silico benchmark studies. In constructing a synthetic community and sequencing its metagenome, we encountered several sources of observation bias that likely affect most metagenomic experiments to date and present challenges for comparative metagenomic studies. DNA preparation methods have a particularly profound effect in our study, implying that samples prepared with

  13. Metagenomics for studying unculturable microorganisms: cutting the Gordian knot.

    Science.gov (United States)

    Schloss, Patrick D; Handelsman, Jo

    2005-01-01

    More than 99% of prokaryotes in the environment cannot be cultured in the laboratory, a phenomenon that limits our understanding of microbial physiology, genetics, and community ecology. One way around this problem is metagenomics, the culture-independent cloning and analysis of microbial DNA extracted directly from an environmental sample. Recent advances in shotgun sequencing and computational methods for genome assembly have advanced the field of metagenomics to provide glimpses into the life of uncultured microorganisms.

  14. Exploration of Metagenome Assemblies with an Interactive Visualization Tool

    Energy Technology Data Exchange (ETDEWEB)

    Cantor, Michael; Nordberg, Henrik; Smirnova, Tatyana; Andersen, Evan; Tringe, Susannah; Hess, Matthias; Dubchak, Inna

    2014-07-09

    Metagenomics, one of the fastest growing areas of modern genomic science, is the genetic profiling of the entire community of microbial organisms present in an environmental sample. Elviz is a web-based tool for the interactive exploration of metagenome assemblies. Elviz can be used with publicly available data sets from the Joint Genome Institute or with custom user-loaded assemblies. Elviz is available at genome.jgi.doe.gov/viz

  15. Clustering metagenomic sequences with interpolated Markov models

    Science.gov (United States)

    2010-01-01

    Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and sequenced by traditional methods. Because the output from metagenomic sequencing is a large set of reads of unknown origin, clustering reads together that were sequenced from the same species is a crucial analysis step. Many effective approaches to this task rely on sequenced genomes in public databases, but these genomes are a highly biased sample that is not necessarily representative of environments interesting to many metagenomics projects. Results: We present SCIMM (Sequence Clustering with Interpolated Markov Models), an unsupervised sequence clustering method. SCIMM achieves greater clustering accuracy than previous unsupervised approaches. We examine the limitations of unsupervised learning on complex datasets, and suggest a hybrid of SCIMM and supervised learning method Phymm called PHYSCIMM that performs better when evolutionarily close training genomes are available. Conclusions: SCIMM and PHYSCIMM are highly accurate methods to cluster metagenomic sequences. SCIMM operates entirely unsupervised, making it ideal for environments containing mostly novel microbes. PHYSCIMM uses supervised learning to improve clustering in environments containing microbial strains from well-characterized genera. SCIMM and PHYSCIMM are available open source from http://www.cbcb.umd.edu/software/scimm. PMID:21044341

  16. Clustering metagenomic sequences with interpolated Markov models.

    Science.gov (United States)

    Kelley, David R; Salzberg, Steven L

    2010-11-02

    Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and sequenced by traditional methods. Because the output from metagenomic sequencing is a large set of reads of unknown origin, clustering reads together that were sequenced from the same species is a crucial analysis step. Many effective approaches to this task rely on sequenced genomes in public databases, but these genomes are a highly biased sample that is not necessarily representative of environments interesting to many metagenomics projects. We present SCIMM (Sequence Clustering with Interpolated Markov Models), an unsupervised sequence clustering method. SCIMM achieves greater clustering accuracy than previous unsupervised approaches. We examine the limitations of unsupervised learning on complex datasets, and suggest a hybrid of SCIMM and supervised learning method Phymm called PHYSCIMM that performs better when evolutionarily close training genomes are available. SCIMM and PHYSCIMM are highly accurate methods to cluster metagenomic sequences. SCIMM operates entirely unsupervised, making it ideal for environments containing mostly novel microbes. PHYSCIMM uses supervised learning to improve clustering in environments containing microbial strains from well-characterized genera. SCIMM and PHYSCIMM are available open source from http://www.cbcb.umd.edu/software/scimm.
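
    The general idea of scoring a read against sequence composition models can be sketched with plain fixed-order Markov chains. This is a simplified stand-in, not SCIMM: an interpolated Markov model combines several context lengths, and SCIMM clusters without labels, whereas the sketch below assumes one fixed order, add-one smoothing, and two pre-labelled toy training sequences. All names and sequences are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(sequences, order=2, pseudocount=1.0):
    """Return a log-probability function log P(base | preceding `order` bases)."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def log_prob(context, base):
        seen = counts.get(context, {})  # .get avoids creating empty entries
        total = sum(seen.values()) + pseudocount * len(ALPHABET)
        return math.log((seen.get(base, 0.0) + pseudocount) / total)

    return log_prob


def score(read, log_prob, order=2):
    """Log-likelihood of a read under one model (first `order` bases are unscored)."""
    return sum(log_prob(read[i - order:i], read[i]) for i in range(order, len(read)))


def assign(read, models, order=2):
    """Pick the model under which the read is most likely."""
    return max(models, key=lambda name: score(read, models[name], order))


models = {
    "at_rich": train_markov(["ATATATATATAT"]),
    "gc_rich": train_markov(["GCGCGCGCGCGC"]),
}
```

    An unsupervised method would alternate between assigning reads to models in this way and retraining each model on its assigned reads, in the manner of k-means.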

  17. Clustering metagenomic sequences with interpolated Markov models

    Directory of Open Access Journals (Sweden)

    Kelley David R

    2010-11-01

    Full Text Available Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and sequenced by traditional methods. Because the output from metagenomic sequencing is a large set of reads of unknown origin, clustering reads together that were sequenced from the same species is a crucial analysis step. Many effective approaches to this task rely on sequenced genomes in public databases, but these genomes are a highly biased sample that is not necessarily representative of environments interesting to many metagenomics projects. Results: We present SCIMM (Sequence Clustering with Interpolated Markov Models), an unsupervised sequence clustering method. SCIMM achieves greater clustering accuracy than previous unsupervised approaches. We examine the limitations of unsupervised learning on complex datasets, and suggest a hybrid of SCIMM and supervised learning method Phymm called PHYSCIMM that performs better when evolutionarily close training genomes are available. Conclusions: SCIMM and PHYSCIMM are highly accurate methods to cluster metagenomic sequences. SCIMM operates entirely unsupervised, making it ideal for environments containing mostly novel microbes. PHYSCIMM uses supervised learning to improve clustering in environments containing microbial strains from well-characterized genera. SCIMM and PHYSCIMM are available open source from http://www.cbcb.umd.edu/software/scimm.

  18. EBI metagenomics in 2016 - an expanding and evolving resource for the analysis and archiving of metagenomic data

    Science.gov (United States)

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; Hoopen, Petra ten; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D.

    2016-01-01

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. PMID:26582919

  19. Multi-proxy geochemical analyses of Indus Submarine Fan sediments sampled by IODP Expedition 355: implications for sediment provenance and palaeoclimate reconstructions

    Science.gov (United States)

    Bratenkov, Sophia; George, Simon C.; Bendle, James; Liddy, Hannah; Clift, Peter D.; Pandey, Dhananjai K.; Kulhanek, Denise K.; Andò, Sergio; Tiwari, Manish; Khim, Boo-Keun; Griffith, Elizabeth; Steinke, Stephan; Suzuki, Kenta; Lee, Jongmin; Newton, Kate; Tripathi, Shubham; Expedition 355 Scientific Party

    2016-04-01

    The interplay between the development of the Asian summer monsoon and the growth of mountains in South and Central Asia is perhaps the most compelling example of the relationship between climate and the solid Earth. Understanding this relationship is crucial in the context of understanding past changes and for predicting future impacts in the Monsoon region. Both rapid and gradual mountain uplift influence the surrounding environments and regional climate. The sedimentary record of the Indus Fan offers a unique opportunity to study the climatic changes that occurred in South Asia and their link to the intensity of the erosion during the late Cenozoic. Although some paleoclimate reconstructions in the region can be partly addressed by studies onshore, the dominance of erosional processes in such a mountainous region ensures such records are fragmentary and limited in coverage. Thus ocean drilling is the best way to recover long sequences and to test the possible relations among mountain uplift, erosion, sediment deposition and climate (including carbon burial, chemical weathering and CO2 drawdown). The sediments and sedimentary rocks from the Indian continental margin, adjoining the Arabian Sea, were drilled during the International Ocean Discovery Program (IODP) Expedition 355. Drilling operations at Site U1456 penetrated through 1109.4 m of sediment and sedimentary rocks. The oldest sediment recovered at this site is dated to 13.5-17.7 Ma, with about 390 m of mass transport deposit. This study provides a multiproxy approach for palaeoenvironmental reconstructions in the Arabian Sea area. We use a wide variety of organic geochemical data coupled with inorganic chemistry, mineralogy, and isotopic analyses. For direct comparison among various data sets, we divided whole round residue from the interstitial water samples among different laboratories, with each receiving 50-300 g (dry mass). 
    The preliminary results include initial sediment provenance data based on bulk

  20. Soil metagenomics and tropical soil productivity

    OpenAIRE

    Garrett, Karen A.

    2009-01-01

    This presentation summarizes research in the soil metagenomics cross-cutting research activity. Soil metagenomics studies soil microbial communities as contributors to soil health. CCRA-4 (Soil Metagenomics)

  1. High throughput comparisons and profiling of metagenomes for industrially relevant enzymes

    KAUST Repository

    Alam, Intikhab

    2016-01-26

    More and more genomes and metagenomes are being sequenced since the advent of Next Generation Sequencing Technologies (NGS). Many metagenomic samples are collected from a variety of environments, each exhibiting a different environmental profile (e.g. temperature and environmental chemistry). These metagenomes can be profiled to unearth enzymes relevant to several industries based on specific enzyme properties, such as the ability to work under extreme conditions (extreme temperatures, salinity, anaerobic conditions, etc.). In this work, we present the DMAP platform, comprising a high-throughput metagenomic annotation pipeline and a data warehouse for comparisons and profiling across large numbers of metagenomes. We developed two reference databases for profiling important genes, one containing enzymes related to different industries and the other containing genes with potential bioactivity roles. In this presentation we describe an example analysis of a large number of publicly available metagenomic samples from the TARA Oceans study (Science 2015), which covers a significant part of the world's oceans.

  2. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline

    OpenAIRE

    Treangen, Todd J.; Koren, Sergey; Sommer, Daniel D; Liu, Bo; Astrovskaya, Irina; Ondov, Brian; Darling, Aaron E.; Phillippy, Adam M; Pop, Mihai

    2013-01-01

    We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing com...

  3. A Bioinformatician's Guide to Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Kunin, Victor; Copeland, Alex; Lapidus, Alla; Mavromatis, Konstantinos; Hugenholtz, Philip

    2008-08-01

    As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, procedures for best practices in their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe step-by-step the chain of decisions accompanying a metagenomic project from the viewpoint of a bioinformatician. We guide the reader through a standard workflow for a metagenomic project beginning with pre-sequencing considerations such as community composition and sequence data type that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation including sample and metadata collection, community profiling, construction of shotgun libraries and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic datasets by contrast to genome projects. Different types of data analyses particular to metagenomes are then presented including binning, dominant population analysis and gene-centric analysis. Finally data management systems and issues are presented and discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions on their journey during a metagenomic project.

  4. Evaluation of ddRADseq for reduced representation metagenome sequencing.

    Science.gov (United States)

    Liu, Michael Y; Worden, Paul; Monahan, Leigh G; DeMaere, Matthew Z; Burke, Catherine M; Djordjevic, Steven P; Charles, Ian G; Darling, Aaron E

    2017-01-01

    Profiling of microbial communities via metagenomic shotgun sequencing has enabled researchers to gain unprecedented insight into microbial community structure and the functional roles of community members. This study describes a method and basic analysis for a metagenomic adaptation of the double digest restriction site associated DNA sequencing (ddRADseq) protocol for reduced representation metagenome profiling. This technique takes advantage of the sequence specificity of restriction endonucleases to construct an Illumina-compatible sequencing library containing DNA fragments that are between a pair of restriction sites located within close proximity. This results in a reduced sequencing library with coverage breadth that can be tuned by size selection. We assessed the performance of the metagenomic ddRADseq approach by applying the full method to human stool samples and generating sequence data. The ddRADseq data yields a similar estimate of community taxonomic profile as obtained from shotgun metagenome sequencing of the same human stool samples. No obvious bias with respect to genomic G + C content and the estimated relative species abundance was detected. Although ddRADseq does introduce some bias in taxonomic representation, the bias is likely to be small relative to DNA extraction bias. ddRADseq appears feasible and could have value as a tool for metagenome-wide association studies.

  5. Multiple comparative metagenomics using multiset k-mer counting

    Directory of Open Access Journals (Sweden)

    Gaëtan Benoit

    2016-11-01

    Full Text Available Background: Large scale metagenomic projects aim to extract biodiversity knowledge between different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomic or functional assignment rely on a small subset of the sequences that can be associated with known organisms. On the other hand, de novo methods that compare the whole sets of sequences either do not scale up to ambitious metagenomic projects or do not provide precise and exhaustive results. Methods: These limitations motivated the development of a new de novo metagenomic comparative method, called Simka. This method computes a large collection of standard ecological distances by replacing species counts with k-mer counts. Simka scales up to today's metagenomic projects thanks to a new parallel k-mer counting strategy on multiple datasets. Results: Experiments on public Human Microbiome Project datasets demonstrate that Simka captures the essential underlying biological structure. Simka was able to compute in a few hours both qualitative and quantitative ecological distances on hundreds of metagenomic samples (690 samples, 32 billion reads). We also demonstrate that analyzing metagenomes at the k-mer level is highly correlated with extremely precise de novo comparison techniques which rely on an all-versus-all sequence alignment strategy or which are based on taxonomic profiling.
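
    Replacing species counts with k-mer counts in an ecological distance can be sketched in a few lines. This is an illustration of the idea only, not Simka's implementation, and it makes no attempt at the parallel counting strategy the abstract describes. The two distances shown (Bray-Curtis as a quantitative measure, Jaccard as a qualitative presence/absence measure), the value k = 2 in the toy data, and all function names are assumptions chosen for the example.

```python
from collections import Counter


def kmer_profile(reads, k=5):
    """Pool the overlapping k-mer counts of all reads in one sample."""
    profile = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            profile[read[i:i + k]] += 1
    return profile


def bray_curtis(a, b):
    """Quantitative distance: 1 - 2 * (shared abundance) / (total abundance)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total


def jaccard_distance(a, b):
    """Qualitative distance: 1 - |shared k-mers| / |all k-mers|, ignoring counts."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)


sample_a = kmer_profile(["ACGTA"], k=2)  # AC, CG, GT, TA
sample_b = kmer_profile(["ACGGG"], k=2)  # AC, CG, GG, GG
bc = bray_curtis(sample_a, sample_b)
jd = jaccard_distance(sample_a, sample_b)
```

    Because no reference database is consulted, every read contributes to the comparison, which is the advantage of de novo methods noted in the abstract.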

  6. Evaluation of ddRADseq for reduced representation metagenome sequencing

    Directory of Open Access Journals (Sweden)

    Michael Y. Liu

    2017-09-01

    Full Text Available Background: Profiling of microbial communities via metagenomic shotgun sequencing has enabled researchers to gain unprecedented insight into microbial community structure and the functional roles of community members. This study describes a method and basic analysis for a metagenomic adaptation of the double digest restriction site associated DNA sequencing (ddRADseq) protocol for reduced representation metagenome profiling. Methods: This technique takes advantage of the sequence specificity of restriction endonucleases to construct an Illumina-compatible sequencing library containing DNA fragments that are between a pair of restriction sites located within close proximity. This results in a reduced sequencing library with coverage breadth that can be tuned by size selection. We assessed the performance of the metagenomic ddRADseq approach by applying the full method to human stool samples and generating sequence data. Results: The ddRADseq data yields a similar estimate of community taxonomic profile as obtained from shotgun metagenome sequencing of the same human stool samples. No obvious bias with respect to genomic G + C content and the estimated relative species abundance was detected. Discussion: Although ddRADseq does introduce some bias in taxonomic representation, the bias is likely to be small relative to DNA extraction bias. ddRADseq appears feasible and could have value as a tool for metagenome-wide association studies.

  7. The YNP metagenome project

    DEFF Research Database (Denmark)

    Inskeep, William P.; Jay, Zackary J.; Tringe, Susannah G.

    2013-01-01

    The Yellowstone geothermal complex contains over 10,000 diverse geothermal features that host numerous phylogenetically deeply rooted and poorly understood archaea, bacteria, and viruses. Microbial communities in high-temperature environments are generally less diverse than soil, marine, sediment......, and environmental variables. Twenty geochemically distinct geothermal ecosystems representing a broad spectrum of Yellowstone hot-spring environments were used for metagenomic and geochemical analysis and included approximately equal numbers of: (1) phototrophic mats, (2) “filamentous streamer” communities, and (3...

  8. METAGENassist: a comprehensive web server for comparative metagenomics.

    Science.gov (United States)

    Arndt, David; Xia, Jianguo; Liu, Yifeng; Zhou, You; Guo, An Chi; Cruz, Joseph A; Sinelnikov, Igor; Budwill, Karen; Nesbø, Camilla L; Wishart, David S

    2012-07-01

    With recent improvements in DNA sequencing and sample extraction techniques, the quantity and quality of metagenomic data are now growing exponentially. This abundance of richly annotated metagenomic data and bacterial census information has spawned a new branch of microbiology called comparative metagenomics. Comparative metagenomics involves the comparison of bacterial populations between different environmental samples, different culture conditions or different microbial hosts. However, in order to do comparative metagenomics, one typically requires a sophisticated knowledge of multivariate statistics and/or advanced software programming skills. To make comparative metagenomics more accessible to microbiologists, we have developed a freely accessible, easy-to-use web server for comparative metagenomic analysis called METAGENassist. Users can upload their bacterial census data from a wide variety of common formats, using either amplified 16S rRNA data or shotgun metagenomic data. Metadata concerning environmental, culture, or host conditions can also be uploaded. During the data upload process, METAGENassist also performs an automated taxonomic-to-phenotypic mapping. Phenotypic information covering nearly 20 functional categories such as GC content, genome size, oxygen requirements, energy sources and preferred temperature range is automatically generated from the taxonomic input data. Using this phenotypically enriched data, users can then perform a variety of multivariate and univariate data analyses including fold change analysis, t-tests, PCA, PLS-DA, clustering and classification. To facilitate data processing, users are guided through a step-by-step analysis workflow using a variety of menus, information hyperlinks and check boxes. METAGENassist also generates colorful, publication quality tables and graphs that can be downloaded and used directly in the preparation of scientific papers. METAGENassist is available at http://www.metagenassist.ca.

  9. METAGENassist: a comprehensive web server for comparative metagenomics

    Science.gov (United States)

    Arndt, David; Xia, Jianguo; Liu, Yifeng; Zhou, You; Guo, An Chi; Cruz, Joseph A.; Sinelnikov, Igor; Budwill, Karen; Nesbø, Camilla L.; Wishart, David S.

    2012-01-01

    With recent improvements in DNA sequencing and sample extraction techniques, the quantity and quality of metagenomic data are now growing exponentially. This abundance of richly annotated metagenomic data and bacterial census information has spawned a new branch of microbiology called comparative metagenomics. Comparative metagenomics involves the comparison of bacterial populations between different environmental samples, different culture conditions or different microbial hosts. However, in order to do comparative metagenomics, one typically requires a sophisticated knowledge of multivariate statistics and/or advanced software programming skills. To make comparative metagenomics more accessible to microbiologists, we have developed a freely accessible, easy-to-use web server for comparative metagenomic analysis called METAGENassist. Users can upload their bacterial census data from a wide variety of common formats, using either amplified 16S rRNA data or shotgun metagenomic data. Metadata concerning environmental, culture, or host conditions can also be uploaded. During the data upload process, METAGENassist also performs an automated taxonomic-to-phenotypic mapping. Phenotypic information covering nearly 20 functional categories such as GC content, genome size, oxygen requirements, energy sources and preferred temperature range is automatically generated from the taxonomic input data. Using this phenotypically enriched data, users can then perform a variety of multivariate and univariate data analyses including fold change analysis, t-tests, PCA, PLS-DA, clustering and classification. To facilitate data processing, users are guided through a step-by-step analysis workflow using a variety of menus, information hyperlinks and check boxes. METAGENassist also generates colorful, publication quality tables and graphs that can be downloaded and used directly in the preparation of scientific papers. METAGENassist is available at http://www.metagenassist.ca. PMID
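
    The univariate analyses named in this abstract (fold change analysis and t-tests) can be illustrated with a minimal sketch. This is not METAGENassist's code: the function names and the choice of Welch's unequal-variance t statistic are assumptions made for illustration, and the inputs are taken to be relative abundances of one taxon in two groups of samples.

```python
from math import log2, sqrt
from statistics import mean, variance


def log2_fold_change(group_a, group_b):
    """log2 of mean(group_b) / mean(group_a); both means must be positive."""
    mean_a, mean_b = mean(group_a), mean(group_b)
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("fold change needs positive group means")
    return log2(mean_b / mean_a)


def welch_t_statistic(group_a, group_b):
    """Welch's t statistic (unequal variances) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two samples")
    # Standard error of the difference in means, from the sample variances.
    std_err = sqrt(variance(group_a) / n_a + variance(group_b) / n_b)
    if std_err == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / std_err
```

    A p-value would additionally require the t distribution with Welch-Satterthwaite degrees of freedom, which is left out of this sketch.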

  10. Parallel-META 2.0: enhanced metagenomic data analysis with functional annotation, high performance computing and advanced visualization.

    Science.gov (United States)

    Su, Xiaoquan; Pan, Weihua; Song, Baoxing; Xu, Jian; Ning, Kang

    2014-01-01

    The metagenomic method directly sequences and analyses genome information from microbial communities. The main computational tasks for metagenomic analyses include taxonomical and functional structure analysis for all genomes in a microbial community (also referred to as a metagenomic sample). With the advancement of Next Generation Sequencing (NGS) techniques, the number of metagenomic samples and the data size for each sample are increasing rapidly. Current metagenomic analysis is both data- and computation-intensive, especially when there are many species in a metagenomic sample, and each has a large number of sequences. As such, metagenomic analyses require extensive computational power. The increasing analytical requirements further augment the challenges for computational analysis. In this work, we have proposed Parallel-META 2.0, a metagenomic analysis software package, to cope with such needs for efficient and fast analyses of taxonomical and functional structures for microbial communities. Parallel-META 2.0 is an extended and improved version of Parallel-META 1.0, which enhances the taxonomical analysis using multiple databases, improves computation efficiency by optimized parallel computing, and supports interactive visualization of results in multiple views. Furthermore, it enables functional analysis for metagenomic samples including short-reads assembly, gene prediction and functional annotation. Therefore, it could provide accurate taxonomical and functional analyses of the metagenomic samples in a high-throughput manner and on a large scale.

  11. Metagenomic sequence of saline desert microbiota from wild ass sanctuary, Little Rann of Kutch, Gujarat, India

    Science.gov (United States)

    Patel, Rajesh; Mevada, Vishal; Prajapati, Dhaval; Dudhagara, Pravin; Koringa, Prakash; Joshi, C.G.

    2015-01-01

    We report the metagenome of a saline desert soil sample from the Little Rann of Kutch, Gujarat State, India. The metagenome consisted of 633,760 sequences totaling 141,307,202 bp with 56% G + C content. The sequence data are available in the EBI Metagenomics database under accession no. ERP005612. Community metagenomics revealed a total of 1802 species belonging to 43 different phyla, with the genera Marinobacter (48.7%) and Halobacterium (4.6%) dominating the bacterial and archaeal domains, respectively. Remarkably, functional metagenome analysis showed 18.2% of sequences in a poorly characterized group and 4% of genes for various stress responses, along with a versatile presence of commercial enzymes. PMID:26484162
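
    As a small worked example of the summary statistics quoted in this abstract, the sketch below derives the mean sequence length from the two reported totals and shows one common way to compute G + C content from a sequence string. The function names are hypothetical, and the authors' own pipeline is not described in the abstract.

```python
# Totals quoted in the abstract above.
N_SEQUENCES = 633_760
TOTAL_BP = 141_307_202


def mean_read_length(total_bp: int, n_sequences: int) -> float:
    """Average sequence length in base pairs."""
    return total_bp / n_sequences


def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Roughly 223 bp per sequence for the reported totals.
    print(round(mean_read_length(TOTAL_BP, N_SEQUENCES), 1))
```

    Ambiguous bases such as N are excluded from the denominator here; other tools may count them differently.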

  12. Metagenomic analysis of medicinal Cannabis samples; pathogenic bacteria, toxigenic fungi, and beneficial microbes grow in culture-based yeast and mold tests [version 1; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Kevin McKernan

    2016-10-01

    Full Text Available Background: The presence of bacteria and fungi in medicinal or recreational Cannabis poses a potential threat to consumers if those microbes include pathogenic or toxigenic species. This study evaluated two widely used culture-based platforms for total yeast and mold (TYM) testing marketed by 3M Corporation and Biomérieux, in comparison with a quantitative PCR (qPCR) approach marketed by Medicinal Genomics Corporation. Methods: A set of 15 medicinal Cannabis samples were analyzed using 3M and Biomérieux culture-based platforms and by qPCR to quantify microbial DNA. All samples were then subjected to next-generation sequencing and metagenomics analysis to enumerate the bacteria and fungi present before and after growth on culture-based media. Results: Several pathogenic or toxigenic bacterial and fungal species were identified in proportions of >5% of classified reads on the samples, including Acinetobacter baumannii, Escherichia coli, Pseudomonas aeruginosa, Ralstonia pickettii, Salmonella enterica, Stenotrophomonas maltophilia, Aspergillus ostianus, Aspergillus sydowii, Penicillium citrinum and Penicillium steckii. Samples subjected to culture showed substantial shifts in the number and diversity of species present, including the failure of Aspergillus species to grow well on either platform. Substantial growth of Clostridium botulinum and other bacteria was frequently observed on one or both of the culture-based TYM platforms. The presence of plant growth promoting (beneficial) fungal species further influenced the differential growth of species in the microbiome of each sample. Conclusions: These findings have important implications for the Cannabis and food safety testing industries.

  13. MetLab: An In Silico Experimental Design, Simulation and Analysis Tool for Viral Metagenomics Studies.

    Science.gov (United States)

    Norling, Martin; Karlsson-Lindsjö, Oskar E; Gourlé, Hadrien; Bongcam-Rudloff, Erik; Hayer, Juliette

    2016-01-01

    Metagenomics, the sequence characterization of all genomes within a sample, is widely used as a virus discovery tool as well as a tool to study viral diversity of animals. Metagenomics can be considered to have three main steps: sample collection and preparation, sequencing, and finally bioinformatics. Bioinformatic analysis of metagenomic datasets is in itself a complex process, involving few standardized methodologies, thereby hampering comparison of metagenomics studies between research groups. In this publication the new bioinformatics framework MetLab is presented, aimed at providing scientists with an integrated tool for experimental design and analysis of viral metagenomes. MetLab provides support in designing the metagenomics experiment by estimating the sequencing depth needed for the complete coverage of a species. This is achieved by applying a methodology to calculate the probability of coverage using an adaptation of Stevens' theorem. It also provides scientists with several pipelines aimed at simplifying the analysis of viral metagenomes, including quality control, assembly and taxonomic binning. We also implement a tool for simulating metagenomics datasets from several sequencing platforms. The overall aim is to provide virologists with an easy-to-use tool for designing, simulating and analyzing viral metagenomes. The results presented here include a benchmark against other existing software, with emphasis on detection of viruses as well as speed of applications. MetLab is packaged as comprehensive software, readily available for Linux and OSX users at https://github.com/norling/metlab.
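
    The coverage calculation mentioned in this abstract can be sketched with the classical form of Stevens' theorem, which gives the probability that n arcs of equal length, placed uniformly at random, cover a whole circle. The mapping to sequencing below (a circular genome, equal-length reads, arc fraction = read length / genome length) is an assumption for illustration; MetLab's own adaptation is not given in the abstract and may differ.

```python
from math import comb


def stevens_coverage_probability(n_fragments: int, arc_fraction: float) -> float:
    """Probability that n random arcs, each covering `arc_fraction` of a
    circle, cover it completely (Stevens' theorem, classical form):

        P = sum_k (-1)^k * C(n, k) * (1 - k*a)^(n - 1),  for all k with 1 - k*a > 0
    """
    if n_fragments < 1 or arc_fraction <= 0:
        return 0.0
    total = 0.0
    for k in range(n_fragments + 1):
        remaining = 1.0 - k * arc_fraction
        if remaining <= 0:
            break  # later terms vanish
        total += (-1) ** k * comb(n_fragments, k) * remaining ** (n_fragments - 1)
    # Clamp small floating-point excursions outside [0, 1].
    return min(1.0, max(0.0, total))
```

    For example, stevens_coverage_probability(3, 0.5) evaluates 1 - 3 * 0.5**2. The alternating sum loses precision in floating point when n is large and the arc fraction is small, so realistic read counts would need exact rational arithmetic or an approximation.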

  14. Inference of microbial recombination rates from metagenomic data.

    Directory of Open Access Journals (Sweden)

    Philip L F Johnson

    2009-10-01

    Full Text Available Metagenomic sequencing projects from environments dominated by a small number of species produce genome-wide population samples. We present a two-site composite likelihood estimator of the scaled recombination rate, rho = 2N_e c, that operates on metagenomic assemblies in which each sequenced fragment derives from a different individual. This new estimator properly accounts for sequencing error, as quantified by per-base quality scores, and missing data, as inferred from the placement of reads in a metagenomic assembly. We apply our estimator to data from a sludge metagenome project to demonstrate how this method will elucidate the rates of exchange of genetic material in natural microbial populations. Surprisingly, for a fixed amount of sequencing, this estimator has lower variance than similar methods that operate on more traditional population genetic samples of comparable size. In addition, we can infer variation in recombination rate across the genome because metagenomic projects sample genetic diversity genome-wide, not just at particular loci. The method itself makes no assumption specific to microbial populations, opening the door for application to any mixed population sample where the number of individuals sampled is much greater than the number of fragments sequenced.

  15. Messages from the first International Conference on Clinical Metagenomics (ICCMg).

    Science.gov (United States)

    Ruppé, Etienne; Greub, Gilbert; Schrenzel, Jacques

    Metagenomics has recently entered clinical microbiology, and an increasing number of diagnostic laboratories now offer the sequencing and annotation of bacterial genomes and/or the analysis of clinical samples by direct or PCR-based metagenomics with a short time to results. In this context, the first International Conference on Clinical Metagenomics (ICCMg) was held in Geneva in October 2016, and several key aspects were discussed, including: i) the need for improved resolution, ii) the importance of interpretation given the common occurrence of sequence contaminants, iii) the need for improved bioinformatic pipelines, iv) the bottleneck of DNA extraction, v) the importance of gold standards, vi) the need to further reduce time to results, vii) how to improve data sharing, viii) the applications of bacterial genomics and clinical metagenomics in better adapting therapeutics and ix) the impact of metagenomics and new sequencing technologies in discovering new microbes. Further efforts in terms of reduced turnaround time, improved quality and lower costs are, however, warranted to fully translate metagenomics into clinical applications. Copyright © 2017 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  16. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes.

    Science.gov (United States)

    King, Paula; Pham, Long K; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca; Forsyth, R Allyn

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile.

  17. Random Whole Metagenomic Sequencing for Forensic Discrimination of Soils

    Science.gov (United States)

    Khodakova, Anastasia S.; Smith, Renee J.; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations. PMID:25111003

  18. Metagenomic Analysis of Bacterial Communities of Antarctic Surface Snow

    Directory of Open Access Journals (Sweden)

    Anna Lopatina

    2016-03-01

    Full Text Available The diversity of bacteria present in surface snow around four Russian stations in Eastern Antarctica was studied by high throughput sequencing of amplified 16S rRNA gene fragments and shotgun metagenomic sequencing. Considerable class- and genus-level variation between the samples was revealed, indicating inter-site diversity of bacteria in Antarctic snow. Flavobacterium was a major genus in one sampling site and was also detected in other sites. The diversity of flavobacterial type II-C CRISPR spacers in the samples was investigated by metagenome sequencing. Thousands of unique spacers were revealed with less than 35% overlap between the sampling sites, indicating an enormous natural variety of flavobacterial CRISPR spacers and, by extension, a high level of adaptive activity of the corresponding CRISPR-Cas system. None of the spacers matched known spacers of flavobacterial isolates from the Northern hemisphere. Moreover, the percentage of spacers with matches to Antarctic metagenomic sequences obtained in this work was significantly higher than with sequences from a much larger publicly available environmental metagenomic database. The results indicate that despite the overall very high level of diversity, Antarctic Flavobacteria comprise a separate pool that experiences pressures from mobile genetic elements different from those present in other parts of the world. The results also establish analysis of metagenomic CRISPR spacer content as a powerful tool to study bacterial population diversity.
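
    The spacer overlap between sampling sites reported in this abstract can be illustrated with a set comparison. The abstract does not say how overlap was defined, so the definition below (shared unique spacers as a percentage of the smaller site's unique spacers, exact string matches only) and the function name are assumptions.

```python
def spacer_overlap_percent(site_a, site_b):
    """Shared unique spacers as a percentage of the smaller set.

    `site_a` and `site_b` are iterables of spacer sequences; duplicates
    within a site are collapsed before comparison.
    """
    unique_a, unique_b = set(site_a), set(site_b)
    if not unique_a or not unique_b:
        return 0.0
    shared = unique_a & unique_b
    return 100.0 * len(shared) / min(len(unique_a), len(unique_b))
```

    Real spacer comparisons often tolerate a few mismatches or cluster near-identical spacers first; exact matching keeps the sketch short.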

  19. Random whole metagenomic sequencing for forensic discrimination of soils.

    Directory of Open Access Journals (Sweden)

    Anastasia S Khodakova

    Full Text Available Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  20. Reconstructing the genomic content of microbiome taxa through shotgun metagenomic deconvolution.

    Science.gov (United States)

    Carr, Rogan; Shen-Orr, Shai S; Borenstein, Elhanan

    2013-01-01

    Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. With the accumulation of metagenomic
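
    The idea of combining gene abundances across samples with taxonomic abundances can be sketched as a small least-squares problem. The linear model assumed here (a gene's abundance in a sample is the sum over taxa of taxon abundance times that taxon's copy number of the gene) and the restriction to two taxa are simplifications for illustration, not the published implementation.

```python
def deconvolve_two_taxa(taxon_abundance, gene_abundance):
    """Least-squares estimate of one gene's copy number in each of two taxa.

    taxon_abundance: list of (a1, a2) pairs, one per sample
    gene_abundance:  list of observed abundances of the gene, one per sample
    Assumed model:   e_i = a1_i * g1 + a2_i * g2
    """
    if len(taxon_abundance) != len(gene_abundance):
        raise ValueError("need one gene abundance per sample")
    # Normal equations (A^T A) g = A^T e for the 2-column matrix A.
    s11 = sum(a1 * a1 for a1, _ in taxon_abundance)
    s12 = sum(a1 * a2 for a1, a2 in taxon_abundance)
    s22 = sum(a2 * a2 for _, a2 in taxon_abundance)
    b1 = sum(a1 * e for (a1, _), e in zip(taxon_abundance, gene_abundance))
    b2 = sum(a2 * e for (_, a2), e in zip(taxon_abundance, gene_abundance))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("taxon abundances do not vary enough across samples")
    # Cramer's rule on the 2x2 system.
    g1 = (b1 * s22 - b2 * s12) / det
    g2 = (s11 * b2 - s12 * b1) / det
    return g1, g2
```

    The determinant check reflects the point made in the abstract: the estimate depends on variation in abundances across samples, and without it the two taxa cannot be separated.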

  1. IMG/M: A data management and analysis system for metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Ivanova, Natalia N.; Szeto, Ernest; Palaniappan, Krishna; Chu, Ken; Dalevi, Daniel; Chen, I-Min A.; Grechkin, Yuri; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Hugenholtz, Phil; Kyrpides, Nikos C.

    2007-08-01

    IMG/M is a data management and analysis system for microbial community genomes (metagenomes) hosted at the Joint Genome Institute (JGI). IMG/M consists of metagenome data integrated with isolate microbial genomes from the Integrated Microbial Genomes (IMG) system. IMG/M provides IMG's comparative data analysis tools extended to handle metagenome data, together with metagenome-specific analysis tools. IMG/M is available at http://img.jgi.doe.gov/m. Studies of the collective genomes (also known as metagenomes) of environmental microbial communities (also known as microbiomes) are expected to lead to advances in environmental cleanup, agriculture, industrial processes, alternative energy production, and human health (1). Metagenomes of specific microbiome samples are sequenced by organizations worldwide, such as the Department of Energy's (DOE) Joint Genome Institute (JGI), the Venter Institute and the Washington University in St. Louis using different sequencing strategies, technology platforms, and annotation procedures. According to the Genomes OnLine Database, about 28 metagenome studies have been published to date, with over 60 other projects ongoing and more in the process of being launched (2). The Department of Energy's (DOE) Joint Genome Institute (JGI) is one of the major contributors of metagenome sequence data, currently sequencing more than 50% of the reported metagenome projects worldwide. Due to the higher complexity, inherent incompleteness, and lower quality of metagenome sequence data, traditional assembly, gene prediction, and annotation methods do not perform on these datasets as well as they do on isolate microbial genome sequences (3, 4). In spite of these limitations, metagenome data are amenable to a variety of analyses, as illustrated by several recent studies (5-10). Metagenome data analysis is usually set up in the context of reference isolate genomes and considers the questions of composition and functional or metabolic

  2. Metagenomic analysis of kimchi, a traditional Korean fermented food.

    Science.gov (United States)

    Jung, Ji Young; Lee, Se Hee; Kim, Jeong Myeong; Park, Moon Su; Bae, Jin-Woo; Hahn, Yoonsoo; Madsen, Eugene L; Jeon, Che Ok

    2011-04-01

    Kimchi, a traditional food in the Korean culture, is made from vegetables by fermentation. In this study, metagenomic approaches were used to monitor changes in bacterial populations, metabolic potential, and overall genetic features of the microbial community during the 29-day fermentation process. Metagenomic DNA was extracted from kimchi samples obtained periodically and was sequenced using a 454 GS FLX Titanium system, which yielded a total of 701,556 reads, with an average read length of 438 bp. Phylogenetic analysis based on 16S rRNA genes from the metagenome indicated that the kimchi microbiome was dominated by members of three genera: Leuconostoc, Lactobacillus, and Weissella. Assignment of metagenomic sequences to SEED categories of the Metagenome Rapid Annotation using Subsystem Technology (MG-RAST) server revealed a genetic profile characteristic of heterotrophic lactic acid fermentation of carbohydrates, which was supported by the detection of mannitol, lactate, acetate, and ethanol as fermentation products. When the metagenomic reads were mapped onto the database of completed genomes, the Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 and Lactobacillus sakei subsp. sakei 23K genomes were highly represented. These same two genera were confirmed to be important in kimchi fermentation when the majority of kimchi metagenomic sequences showed very high identity to Leuconostoc mesenteroides and Lactobacillus genes. Besides microbial genome sequences, a surprisingly large number of phage DNA sequences were identified from the cellular fractions, possibly indicating that a high proportion of cells were infected by bacteriophages during fermentation. Overall, these results provide insights into the kimchi microbial community and also shed light on fermentation processes carried out broadly by complex microbial communities.

  3. Environmental Metagenomics: The Data Assembly and Data Analysis Perspectives

    Science.gov (United States)

    Kumar, Vinay; Maitra, S. S.; Shukla, Rohit Nandan

    2015-01-01

    Novel gene finding is one of the emerging fields in environmental research. In past decades, research focused mainly on the discovery of microorganisms capable of degrading a particular compound. Many methods for the cultivation and screening of these novel microorganisms are available in the literature. All of these methods are efficient for screening microbes that can be cultivated in the laboratory. Microorganisms that live in extreme conditions such as hot springs, frozen glaciers and acid mine drainage cannot be cultivated in the laboratory because of incomplete knowledge about their growth requirements, such as temperature and nutrients, and their mutual dependence on each other. The microbes that can be cultivated correspond to less than 1% of the total microbes present on Earth. The remaining 99%, the uncultivated majority, remains inaccessible. Metagenomics transcends the culture requirements of microbes. In metagenomics, DNA is extracted directly from environmental samples such as soil, seawater and acid mine drainage, followed by construction and screening of a metagenomic library. With ongoing research, a huge amount of metagenomic data is accumulating. Understanding these data is an essential step in extracting novel genes of industrial importance. Various bioinformatics tools have been designed to analyze and annotate the data produced from the metagenome. The bioinformatic requirements of metagenomic data analysis differ in theory and practice. This paper reviews the tools that are available for metagenomic data analysis and the capabilities of such tools: what they can do and their web availability.

  4. Accessing the Soil Metagenome for Studies of Microbial Diversity

    Science.gov (United States)

    Delmont, Tom O.; Robe, Patrick; Cecillon, Sébastien; Clark, Ian M.; Constancias, Florentin; Simonet, Pascal; Hirsch, Penny R.; Vogel, Timothy M.

    2011-01-01

    Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome. PMID:21183646

  5. metaSNV: A tool for metagenomic strain level analysis.

    Science.gov (United States)

    Costea, Paul Igor; Munch, Robin; Coelho, Luis Pedro; Paoli, Lucas; Sunagawa, Shinichi; Bork, Peer

    2017-01-01

    We present metaSNV, a tool for single nucleotide variant (SNV) analysis in metagenomic samples, capable of comparing populations of thousands of bacterial and archaeal species. The tool uses as input nucleotide sequence alignments to reference genomes in standard SAM/BAM format, performs SNV calling for individual samples and across the whole data set, and generates various statistics for individual species including allele frequencies and nucleotide diversity per sample as well as distances and fixation indices across samples. Using published data from 676 metagenomic samples of different sites in the oral cavity, we show that the results of metaSNV are comparable to those of MIDAS, an alternative implementation for metagenomic SNV analysis, while data processing is faster and has a smaller storage footprint. Moreover, we implement a set of distance measures that allow the comparison of genomic variation across metagenomic samples and delineate sample-specific variants to enable the tracking of specific strain populations over time. The implementation of metaSNV is available at: http://metasnv.embl.de/.
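
    Two of the per-site statistics named in this abstract, allele frequencies and nucleotide diversity, can be sketched from the bases observed at one aligned position. This is not metaSNV's code or interface; the function names are hypothetical, and diversity is taken here as the probability that two reads drawn without replacement carry different bases.

```python
from collections import Counter


def _base_counts(pileup: str) -> Counter:
    """Count unambiguous bases (A, C, G, T) observed at one position."""
    return Counter(base for base in pileup.upper() if base in "ACGT")


def allele_frequencies(pileup: str) -> dict:
    """Frequency of each observed base at the position."""
    counts = _base_counts(pileup)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: count / depth for base, count in counts.items()}


def site_nucleotide_diversity(pileup: str) -> float:
    """Probability that two distinct reads at this position differ."""
    counts = _base_counts(pileup)
    depth = sum(counts.values())
    if depth < 2:
        return 0.0
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (depth * (depth - 1))
```

    Averaging the per-site value over all covered positions of a genome gives a per-sample diversity estimate; coverage and quality filters, which a production tool would apply, are omitted.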

  6. Open resource metagenomics: a model for sharing metagenomic libraries.

    Science.gov (United States)

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the

  7. Selection in coastal Synechococcus (cyanobacteria) populations evaluated from environmental metagenomes.

    Directory of Open Access Journals (Sweden)

    Vera Tai

    Full Text Available Environmental metagenomics provides snippets of genomic sequences from all organisms in an environmental sample and is an unprecedented resource of information for investigating microbial population genetics. Current analytical methods, however, are poorly equipped to handle metagenomic data, particularly of short, unlinked sequences. A custom analytical pipeline was developed to calculate dN/dS ratios, a common metric to evaluate the role of selection in the evolution of a gene, from environmental metagenomes sequenced using 454 technology of flow-sorted populations of marine Synechococcus, the dominant cyanobacteria in coastal environments. The large majority of genes (98%) have evolved under purifying selection (dN/dS < 1). Of the genes with dN/dS > 1, 77 out of 83 (93%) were hypothetical. Notable among annotated genes, ribosomal protein L35 appears to be under positive selection in one Synechococcus population. Other annotated genes, in particular a possible porin, a large-conductance mechanosensitive channel, an ATP binding component of an ABC transporter, and a homologue of a pilus retraction protein had regions of the gene with elevated dN/dS. With the increasing use of next-generation sequencing in metagenomic investigations of microbial diversity and ecology, analytical methods need to accommodate the peculiarities of these data streams. By developing a means to analyze population diversity data from these environmental metagenomes, we have provided the first insight into the role of selection in the evolution of Synechococcus, a globally significant primary producer.

  8. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    Science.gov (United States)

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  9. Fast and sensitive taxonomic classification for metagenomics with Kaiju

    DEFF Research Database (Denmark)

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-01-01

    The constantly decreasing cost and increasing output of current sequencing technologies enable large scale metagenomic studies of microbial communities from diverse habitats. Therefore, fast and accurate methods for taxonomic classification are needed, which can operate on increasingly larger datasets and reference databases. Recently, several fast metagenomic classifiers have been developed, which are based on comparison of genomic k-mers. However, nucleotide comparison using a fixed k-mer length often lacks the sensitivity to overcome the evolutionary distance between sampled species and genomes in the reference database. Here, we present the novel metagenome classifier Kaiju for fast assignment of reads to taxa. Kaiju finds maximum exact matches on the protein-level using the Burrows-Wheeler transform, and can optionally allow amino acid substitutions in the search using a greedy...
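
    The idea of classifying a read by its longest exact protein-level match can be sketched in a few lines. This is a naive illustration only, not Kaiju's implementation: it skips six-frame translation and the Burrows-Wheeler index, uses Python's difflib for the substring search, and the reference sequences, taxon names, and min_len threshold are made up for the example.

```python
from difflib import SequenceMatcher

def longest_exact_match(query, reference):
    """Return the longest substring shared by query and reference."""
    matcher = SequenceMatcher(None, query, reference, autojunk=False)
    m = matcher.find_longest_match(0, len(query), 0, len(reference))
    return query[m.a:m.a + m.size]

def classify(query, references, min_len=5):
    """Assign the query to the taxon with the longest exact match.

    references maps a (hypothetical) taxon name to a protein sequence.
    Returns (taxon, match); taxon is None if the match is shorter than min_len.
    """
    best_taxon, best_match = None, ""
    for taxon, protein in references.items():
        match = longest_exact_match(query, protein)
        if len(match) > len(best_match):
            best_taxon, best_match = taxon, match
    if len(best_match) < min_len:
        return None, best_match
    return best_taxon, best_match

# Toy reference "database" (invented sequences).
references = {
    "taxon_X": "MSTNPKPQRKTKRNTNRRPQDVK",
    "taxon_Y": "MAHHHHHHVDDDDKMLEVLFQGP",
}
taxon, match = classify("GGKPQRKTKRNAA", references)
```

    A real classifier would replace the quadratic substring search with an index over the whole reference database, which is where the Burrows-Wheeler transform comes in.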

  10. Parallel-META: efficient metagenomic data analysis based on high-performance computation.

    Science.gov (United States)

    Su, Xiaoquan; Xu, Jian; Ning, Kang

    2012-01-01

    Metagenomics directly sequences and analyses genome information from microbial communities. There are usually hundreds of genomes or more from different microbial species in the same community, and the main computational tasks for metagenomic data analyses include taxonomical and functional component examination of all genomes in the microbial community. Metagenomic data analysis is both data- and computation-intensive, which requires extensive computational power. Most current metagenomic data analysis software was designed to be used on a single computer or a single computer cluster, which cannot keep up with the fast-growing computational requirements of large metagenomic projects. Therefore, advanced computational methods and pipelines have to be developed to cope with such need for efficient analyses. In this paper, we proposed Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis, which enabled the efficient and parallel analysis of multiple metagenomic datasets and the visualization of the results for multiple samples. In Parallel-META, the similarity-based database search was parallelized based on GPU computing and multi-core CPU computing optimization. Experiments have shown that Parallel-META achieves at least a 15-fold speed-up over traditional metagenomic data analysis methods, with the same accuracy of results (http://www.computationalbioenergy.org/parallel-meta.html). The parallel processing of current metagenomic data would be very promising: with the current speed-up of 15 times and above, binning would no longer be a very time-consuming process. Therefore, some deeper analysis of the metagenomic data, such as the comparison of different samples, would be feasible in the pipeline, and some of these functionalities have been included in the Parallel-META pipeline.

  11. In silico approach to designing rational metagenomic libraries for functional studies.

    Science.gov (United States)

    Kusnezowa, Anna; Leichert, Lars I

    2017-05-22

    With the development of Next Generation Sequencing technologies, the number of predicted proteins from entire (meta-) genomes has risen exponentially. While for some of these sequences protein functions can be inferred from homology, an experimental characterization is still a requirement for the determination of protein function. However, functional characterization of proteins cannot keep pace with our capabilities to generate more and more sequence data. Here, we present an approach to reduce the number of proteins from entire (meta-) genomes to a reasonably small number for further experimental characterization without loss of important information. About 6.1 million predicted proteins from the Global Ocean Sampling Expedition Metagenome project were distributed into classes based either on homology to existing hidden Markov models (HMMs) of known families, or de novo by assessment of pairwise similarity. 5.1 million of these proteins could be classified in this way, yielding 18,437 families. For 4,129 protein families, which did not match existing HMMs from databases, we could create novel HMMs. For each family, we then selected a representative protein, which showed the closest homology to all other proteins in this family. We then selected representatives of four families based on their homology to known and well-characterized lipases. From these four synthesized genes, we could obtain the novel esterase/lipase GOS54, validating our approach. Using an in silico approach, we were able to improve the success rate of functional screening and make entire (meta-) genomes amenable for biochemical characterization.
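
    Selecting the family member closest to all others amounts to choosing a medoid. The sketch below is one possible reading of that step, not the authors' pipeline: it assumes equal-length toy sequences and uses fractional identity as a stand-in for a real homology score; the function names and the example family are invented.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy similarity expects equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def pick_representative(members, similarity=identity):
    """Return the member with the highest summed similarity to all others."""
    def total_similarity(candidate):
        return sum(similarity(candidate, other)
                   for other in members if other is not candidate)
    return max(members, key=total_similarity)

# Invented three-member "family"; the middle sequence differs from each
# of the others at a single position, so it is the medoid.
family = ["MKTAYIA", "MKTAYLA", "MKSAYLA"]
representative = pick_representative(family)
```

    With a real data set the similarity function would typically come from pairwise alignment or HMM scores rather than position-wise identity.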

  12. Metagenomics and other Methods for Measuring Antibiotic Resistance in Agroecosystems

    Science.gov (United States)

    Background: There is broad concern regarding antibiotic resistance on farms and in fields, however there is no standard method for defining or measuring antibiotic resistance in environmental samples. Methods: We used metagenomic, culture-based, and molecular methods to characterize the amount, t...

  13. Metagenomic species profiling using universal phylogenetic marker genes

    DEFF Research Database (Denmark)

    Sunagawa, Shinichi; Mende, Daniel R; Zeller, Georg

    2013-01-01

    To quantify known and unknown microorganisms at species-level resolution using shotgun sequencing data, we developed a method that establishes metagenomic operational taxonomic units (mOTUs) based on single-copy phylogenetic marker genes. Applied to 252 human fecal samples, the method revealed...

  14. Evaluating the Quantitative Capabilities of Metagenomic Analysis Software.

    Science.gov (United States)

    Kerepesi, Csaba; Grolmusz, Vince

    2016-05-01

    DNA sequencing technologies are applied widely and frequently today to describe metagenomes, i.e., microbial communities in environmental or clinical samples, without the need for culturing them. These technologies usually return short (100-300 base-pairs long) DNA reads, and these reads are processed by metagenomic analysis software that assigns phylogenetic composition-information to the dataset. Here we evaluate three metagenomic analysis software packages (AmphoraNet--a webserver implementation of AMPHORA2--, MG-RAST, and MEGAN5) for their capabilities of assigning quantitative phylogenetic information for the data, describing the frequency of appearance of the microorganisms of the same taxa in the sample. The difficulties of the task arise from the fact that longer genomes produce more reads from the same organism than shorter genomes, and some software assigns higher frequencies to species with longer genomes than to those with shorter ones. This phenomenon is called the "genome length bias." Dozens of complex artificial metagenome benchmarks can be found in the literature. Because of the complexity of those benchmarks, it is usually difficult to judge the resistance of a metagenomic software package to this "genome length bias." Therefore, we have made a simple benchmark for the evaluation of the "taxon-counting" in a metagenomic sample: we have taken the same number of copies of three full bacterial genomes of different lengths, broken them up randomly into short reads of average length of 150 bp, and mixed the reads, creating our simple benchmark. Because of its simplicity, the benchmark is not supposed to serve as a mock metagenome, but if a software package fails on that simple task, it will surely fail on most real metagenomes. We applied the three software packages to the benchmark. The ideal quantitative solution would assign the same proportion to the three bacterial taxa. We have found that AMPHORA2/AmphoraNet gave the most accurate results and the other two software were under
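
    The genome length bias described here follows from simple arithmetic, which the sketch below makes explicit. It is an idealised model rather than the authors' benchmark: the three genome lengths and the copy number are hypothetical, and reads are counted deterministically instead of being sampled at random.

```python
def read_counts(genome_lengths, copies=10, read_len=150):
    """Reads produced per taxon when every genome is present in the same
    number of copies and is cut into reads of read_len bases."""
    return {name: copies * length // read_len
            for name, length in genome_lengths.items()}

def naive_proportions(counts):
    """Taxon abundance estimated directly from raw read counts."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

def length_normalized_proportions(counts, genome_lengths):
    """Taxon abundance after dividing read counts by genome length."""
    per_base = {name: counts[name] / genome_lengths[name] for name in counts}
    total = sum(per_base.values())
    return {name: v / total for name, v in per_base.items()}

# Hypothetical genome lengths in base pairs (not the genomes used in the paper).
genomes = {"taxon_A": 1_500_000, "taxon_B": 3_000_000, "taxon_C": 6_000_000}
counts = read_counts(genomes)
naive = naive_proportions(counts)
corrected = length_normalized_proportions(counts, genomes)
```

    Under these assumptions the raw read counts give taxon_C four times the share of taxon_A even though all three taxa are equally abundant, while dividing by genome length recovers equal proportions.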

  15. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.

    Science.gov (United States)

    Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G

    2012-12-07

    MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads of various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service that is capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies allowing their comparative analysis together with user samples, namely datasets from Russian Metagenome project, MetaHIT and Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL, Python, with all major browsers supported.

  16. Web Resources for Metagenomics Studies.

    Science.gov (United States)

    Dudhagara, Pravin; Bhavsar, Sunil; Bhagat, Chintan; Ghelani, Anjana; Bhatt, Shreyas; Patel, Rajesh

    2015-10-01

    The development of next-generation sequencing (NGS) platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly-used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data. There are at least a dozen such software tools presently available in the public domain. Among them, MGRAST, IMG/M, and METAVIR are the most well-known tools according to the number of citations by peer-reviewed scientific media up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We have also rated each tool to identify the most promising ones and evaluated the five best tools using a synthetic metagenome. The article comprehensively deals with the contemporary problems and the prospects of metagenomics from a bioinformatics viewpoint. Copyright © 2015. Production and hosting by Elsevier Ltd.

  17. Metagenomics and the niche concept.

    Science.gov (United States)

    Marco, Diana

    2008-08-01

    The metagenomics approach has revolutionised the fields of bacterial diversity, ecology and evolution, as well as derived applications like bioremediation and obtaining bioproducts. A further associated conceptual change has also occurred, since in the metagenomics methodology the species is no longer the unit of study, but rather partial genome arrangements or even isolated genes. In spite of this, concepts coming from ecological and evolutionary fields traditionally centred on the species, like the concept of niche, are still being applied without further revision. A reformulation of the niche concept is necessary to deal with the new operative and epistemological challenges posed by the metagenomics approach. To contribute to this end, I review past and present uses of the niche concept in ecology and in microbiological studies, showing that a new, updated definition needs to be used in the context of metagenomics. Finally, I give some insights into a more adequate conceptual background for the utilisation of the niche concept in metagenomic studies. In particular, I raise the necessity of including the microbial genetic background as another variable in the niche space.

  18. Identifying personal microbiomes using metagenomic codes.

    Science.gov (United States)

    Franzosa, Eric A; Huang, Katherine; Meadow, James F; Gevers, Dirk; Lemon, Katherine P; Bohannan, Brendan J M; Huttenhower, Curtis

    2015-06-02

    Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30-300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability, a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability.
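
    A hitting-set formulation of a unique code can be illustrated with a small greedy routine. This is a sketch of the general technique under simplifying assumptions, not the published algorithm: it treats features as present or absent, ignores the prioritisation for temporal stability, and the strain and person names are invented.

```python
def greedy_code(target, others):
    """Pick features of `target` so that every other individual lacks at
    least one of them (a greedy hitting set over the difference sets).

    target: set of features observed in the individual to be identified.
    others: dict mapping individual name -> set of observed features.
    Returns a list of features, or None if some other individual carries
    every feature of the target and no unique code exists.
    """
    # For each other individual, the target features that they lack.
    diffs = {name: target - feats for name, feats in others.items()}
    if any(not d for d in diffs.values()):
        return None
    code, unhit = [], set(diffs)
    while unhit:
        # Choose the feature that rules out the most remaining individuals;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(target),
                   key=lambda f: sum(f in diffs[n] for n in unhit))
        code.append(best)
        unhit = {n for n in unhit if best not in diffs[n]}
    return code

# Hypothetical strain-level features for four people.
target = {"s1", "s2", "s3", "s4"}
others = {
    "person_2": {"s1", "s2"},
    "person_3": {"s2", "s3"},
    "person_4": {"s1", "s3", "s4"},
}
code = greedy_code(target, others)
```

    Here two features suffice: no other individual carries both of them, so the pair identifies the target within this toy population.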

  19. A metagenomic framework for the study of airborne microbial communities.

    Directory of Open Access Journals (Sweden)

    Shibu Yooseph

    Full Text Available Understanding the microbial content of the air has important scientific, health, and economic implications. While studies have primarily characterized the taxonomic content of air samples by sequencing the 16S or 18S ribosomal RNA gene, direct analysis of the genomic content of airborne microorganisms has not been possible due to the extremely low density of biological material in airborne environments. We developed sampling and amplification methods to enable adequate DNA recovery to allow metagenomic profiling of air samples collected from indoor and outdoor environments. Air samples were collected from a large urban building, a medical center, a house, and a pier. Analyses of metagenomic data generated from these samples reveal airborne communities with a high degree of diversity and different genera abundance profiles. The identities of many of the taxonomic groups and protein families also allow for the identification of the likely sources of the sampled airborne bacteria.

  20. Precision Metagenomics: Rapid Metagenomic Analyses for Infectious Disease Diagnostics and Public Health Surveillance.

    Science.gov (United States)

    Afshinnekoo, Ebrahim; Chou, Chou; Alexander, Noah; Ahsanuddin, Sofia; Schuetz, Audrey N; Mason, Christopher E

    2017-04-01

    Next-generation sequencing (NGS) technologies have ushered in the era of precision medicine, transforming the way we treat cancer patients and diagnose disease. Concomitantly, the advent of these technologies has created a surge of microbiome and metagenomic studies over the last decade, many of which are focused on investigating the host-gene-microbial interactions responsible for the development and spread of infectious diseases, as well as delineating their key role in maintaining health. As we continue to discover more information about the etiology of infectious diseases, the translational potential of metagenomic NGS methods for treatment and rapid diagnosis is becoming abundantly clear. Here, we present a robust protocol for the implementation and application of "precision metagenomics" across various sequencing platforms for clinical samples. Such a pipeline integrates DNA/RNA extraction, library preparation, sequencing, and bioinformatics analyses for taxonomic classification, antimicrobial resistance (AMR) marker screening, and functional analysis (biochemical and metabolic pathway abundance). Moreover, the pipeline has 3 tracks: STAT for results within 24 h; Comprehensive that affords a more in-depth analysis and takes between 5 and 7 d, but offers antimicrobial resistance information; and Targeted, which also requires 5-7 d, but with more sensitive analysis for specific pathogens. Finally, we discuss the challenges that need to be addressed before full integration in the clinical setting.

  1. Metagenomics and novel gene discovery

    Science.gov (United States)

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  2. Revised computational metagenomic processing uncovers hidden and biologically meaningful functional variation in the human microbiome.

    Science.gov (United States)

    Manor, Ohad; Borenstein, Elhanan

    2017-02-08

    Recent metagenomic analyses of the human gut microbiome identified striking variability in its taxonomic composition across individuals. Notably, however, these studies often reported marked functional uniformity, with relatively little variation in the microbiome's gene composition or in its overall metabolic capacity. Here, we address this surprising discrepancy between taxonomic and functional variations and set out to track its origins. Specifically, we demonstrate that the functional uniformity observed in microbiome studies can be attributed, at least partly, to common computational metagenomic processing procedures that mask true functional variation across microbiome samples. We identify several such procedures, including commonly used practices for gene abundance normalization, mapping of gene families to functional pathways, and gene family aggregation. We show that accounting for these factors and using revised metagenomic processing procedures uncovers such hidden functional variation, significantly increasing observed variation in the abundance of functional elements across samples. Importantly, we find that this uncovered variation is biologically meaningful and that it is associated with both host identity and health. Accurate characterization of functional variation in the microbiome is essential for comparative metagenomic analyses in health and disease. Our finding that metagenomic processing procedures mask underlying and biologically meaningful functional variation therefore highlights an important challenge such studies may face. Alternative schemes for metagenomic processing that uncover this hidden functional variation can facilitate improved metagenomic analysis and help pinpoint disease- and host-associated shifts in the microbiome's functional capacity.

  3. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition.

    Science.gov (United States)

    Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Acinas, Silvia G; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E; Stepanauskas, Ramunas; Sullivan, Matthew B; Brum, Jennifer R; Duhaime, Melissa B; Poulos, Bonnie T; Hurwitz, Bonnie L; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

    2017-08-01

    A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

  4. Novel Strategies for Applied Metagenomics.

    Science.gov (United States)

    Moore-Connors, Jessica M; Dunn, Katherine A; Bielawski, Joseph P; Van Limbergen, Johan

    2016-03-01

    Detailed analyses of the gut microbiome and its effect on human physiology and disease are emerging, thanks to advances in high-throughput DNA-sequencing technology and the burgeoning field of metagenomics. Metagenomics examines the structure and functional potential of microbial communities in their native habitats through the direct isolation and analysis of community DNA. In inflammatory bowel disease, gut microbiome studies have shown an association with perturbations in community composition and, especially, function. In this review, we discuss the application of next-generation sequencing to microbiome research and highlight the importance of modeling microbiome structure and function to the future of inflammatory bowel disease research and treatment.

  5. Metagenomic characterization of ambulances across the USA.

    Science.gov (United States)

    O'Hara, Niamh B; Reed, Harry J; Afshinnekoo, Ebrahim; Harvin, Donell; Caplan, Nora; Rosen, Gail; Frye, Brook; Woloszynek, Stephen; Ounit, Rachid; Levy, Shawn; Butler, Erin; Mason, Christopher E

    2017-09-22

    Microbial communities in our built environments have great influence on human health and disease. A variety of built environments have been characterized using a metagenomics-based approach, including some healthcare settings. However, there has been no study to date that has used this approach in pre-hospital settings, such as ambulances, an important first point-of-contact between patients and hospitals. We sequenced 398 samples from 137 ambulances across the USA using shotgun sequencing. We analyzed these data to explore the microbial ecology of ambulances including characterizing microbial community composition, nosocomial pathogens, patterns of diversity, presence of functional pathways and antimicrobial resistance, and potential spatial and environmental factors that may contribute to community composition. We found that the top 10 most abundant species are either common built environment microbes, microbes associated with the human microbiome (e.g., skin), or are species associated with nosocomial infections. We also found widespread evidence of antimicrobial resistance markers (hits in ~90% of samples). We identified six factors that may influence the microbial ecology of ambulances including ambulance surfaces, geographical-related factors (including region, longitude, and latitude), and weather-related factors (including temperature and precipitation). While the vast majority of microbial species classified were beneficial, we also found widespread evidence of species associated with nosocomial infections and antimicrobial resistance markers. This study indicates that metagenomics may be useful to characterize the microbial ecology of pre-hospital ambulance settings and that more rigorous testing and cleaning of ambulances may be warranted.

  6. Metagenomic analysis of microbial communities and beyond

    DEFF Research Database (Denmark)

    Schreiber, Lars

    2014-01-01

    From small clone libraries to large next-generation sequencing datasets – the field of community genomics or metagenomics has developed tremendously within the last years. This chapter will summarize some of these developments and will also highlight pitfalls of current metagenomic analyses. It will illustrate the general workflow of a metagenomic study and introduce the three different metagenomic approaches: (1) the random shotgun approach that focuses on the metagenome as a whole, (2) the targeted approach that focuses on metagenomic amplicon sequences, and (3) the function-driven approach that uses heterologous expression of metagenomic DNA fragments to discover novel metabolic functions. Lastly, the chapter will shortly discuss the meta-analysis of gene expression of microbial communities, more precisely metatranscriptomics and metaproteomics....

  7. Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data

    DEFF Research Database (Denmark)

    Raes, Jeroen; Letunic, Ivica; Yamada, Takuji

    2011-01-01

    Using metagenomic 'parts lists' to infer global patterns on microbial ecology remains a significant challenge. To deduce important ecological indicators such as environmental adaptation, molecular trait dispersal, diversity variation and primary production from the gene pool of an ecosystem, we integrated 25 ocean metagenomes with geographical, meteorological and geophysicochemical data. We find that climatic factors (temperature, sunlight) are the major determinants of the biomolecular repertoire of each sample and the main limiting factor on functional trait dispersal (absence of biogeographic...... composition derived from metagenomes is an important quantitative readout for molecular trait-based biogeography and ecology....

  8. Analysis of composition-based metagenomic classification.

    Science.gov (United States)

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. Regarding the similarity measure, in
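
    The effect of n-mer length on frequency vectors can be shown with a toy example. The sketch below is illustrative only: it uses cosine similarity as one possible similarity measure, the two sequences are invented, and it does not reproduce the score function defined in the study.

```python
import math
from collections import Counter
from itertools import product

def nmer_vector(seq, n):
    """Relative frequencies of all 4**n DNA n-mers in seq (fixed order)."""
    alphabet = "ACGT"
    windows = (seq[i:i + n] for i in range(len(seq) - n + 1))
    counts = Counter(w for w in windows if set(w) <= set(alphabet))
    total = sum(counts.values())
    nmers = ("".join(p) for p in product(alphabet, repeat=n))
    return [counts[k] / total if total else 0.0 for k in nmers]

def cosine_similarity(u, v):
    """Cosine of the angle between two frequency vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Two different toy sequences with identical base composition.
seq_a = "ACGTACGTACGT"
seq_b = "TGCATGCATGCA"

sim_n1 = cosine_similarity(nmer_vector(seq_a, 1), nmer_vector(seq_b, 1))
sim_n3 = cosine_similarity(nmer_vector(seq_a, 3), nmer_vector(seq_b, 3))
```

    With n = 1 the two distinct sequences map to the same vector, which is the sense in which short n-mers represent different sequences alike; with n = 3 they share no n-mers at all, and the vector length 4**n grows quickly, which is why large n yields sparse vectors.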

  9. Expeditions and other exploration

    NARCIS (Netherlands)

    NN,

    1964-01-01

    Previous to the 4th UNESCO Expedition, Dr H. Sleumer of the Rijksherbarium made three trips together with Mr Tem Smitinand, first to Doi Chiengdao and Doi Suthep in the North (Aug. 15-21, 1963), then to the Khao Yai National Park in Central Siam (Aug. 28-29), then to Pha Nok Khao and Phu Krading

  10. Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in marker gene phylogenetic trees.

    Directory of Open Access Journals (Sweden)

    Dongying Wu

    Full Text Available BACKGROUND: Most of our knowledge about the ancient evolutionary history of organisms has been derived from data associated with specific known organisms (i.e., organisms that we can study directly such as plants, metazoans, and culturable microbes). Recently, however, a new source of data for such studies has arrived: DNA sequence data generated directly from environmental samples. Such metagenomic data has enormous potential in a variety of areas including, as we argue here, in studies of very early events in the evolution of gene families and of species. METHODOLOGY/PRINCIPAL FINDINGS: We designed and implemented new methods for analyzing metagenomic data and used them to search the Global Ocean Sampling (GOS) expedition data set for novel lineages in three gene families commonly used in phylogenetic studies of known and unknown organisms: small subunit rRNA and the recA and rpoB superfamilies. Though the methods available could not accurately identify very deeply branched ss-rRNAs (largely due to difficulties in making robust sequence alignments for novel rRNA fragments), our analysis revealed the existence of multiple novel branches in the recA and rpoB gene families. Analysis of available sequence data likely from the same genomes as these novel recA and rpoB homologs was then used to further characterize the possible organismal source of the novel sequences. CONCLUSIONS/SIGNIFICANCE: Of the novel recA and rpoB homologs identified in the metagenomic data, some likely come from uncharacterized viruses while others may represent ancient paralogs not yet seen in any cultured organism. A third possibility is that some come from novel cellular lineages that are only distantly related to any organisms for which sequence data is currently available. If there exist any major, but so-far-undiscovered, deeply branching lineages in the tree of life, we suggest that methods such as those described herein currently offer the best way to search for them.

  11. DectICO: an alignment-free supervised metagenomic classification method based on feature extraction and dynamic selection.

    Science.gov (United States)

    Ding, Xiao; Cheng, Fudong; Cao, Changchang; Sun, Xiao

    2015-10-07

    Continual progress in next-generation sequencing allows for generating increasingly large metagenomes sampled over time or space. Comparing and classifying the metagenomes with different microbial communities is critical. Alignment-free supervised classification is important for discriminating between the multifarious components of metagenomic samples, because it can be accomplished independently of known microbial genomes. We propose an alignment-free supervised metagenomic classification method called DectICO. The intrinsic correlation of oligonucleotides provides the feature set, which is selected dynamically using a kernel partial least squares algorithm, and the feature matrices extracted with this set are sequentially employed to train classifiers by support vector machine (SVM). We evaluated the classification performance of DectICO on three actual metagenomic sequencing datasets, two containing deep sequencing metagenomes and one of low coverage. Validation results show that DectICO is powerful, performs well based on long oligonucleotides (i.e., 6-mer to 8-mer), and is more stable and generalized than a sequence-composition-based method. The classifiers trained by our method are more accurate than non-dynamic feature selection methods and a recently published recursive-SVM-based classification approach. The alignment-free supervised classification method DectICO can accurately classify metagenomic samples without dependence on known microbial genomes. Selecting the ICO dynamically offers better stability and generality compared with sequence-composition-based classification algorithms. Our proposed method provides new insights in metagenomic sample classification.

  12. EBI metagenomics in 2016--an expanding and evolving resource for the analysis and archiving of metagenomic data.

    Science.gov (United States)

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; ten Hoopen, Petra; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D

    2016-01-04

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Functional Intestinal Metagenomics (Chapter 18)

    NARCIS (Netherlands)

    Bogert, van den B.; Leimena, M.M.; Vos, de W.M.; Zoetendal, E.G.; Kleerebezem, M.

    2011-01-01

    The premiere two-volume reference on revelations from studying complex microbial communities in many distinct habitats. Metagenomics is an emerging field that has changed the way microbiologists study microorganisms. It involves the genomic analysis of microorganisms by extraction and cloning of DNA

  14. Estimating richness from phage metagenomes

    Science.gov (United States)

    Bacteriophages are important drivers of ecosystem functions, yet little is known about the vast majority of phages. Phage metagenomics, or the study of the collective genome of an assemblage of phages, enables the investigation of broad ecological questions in phage communities. One ecological cha...

  15. Metagenomic Analysis of Dairy Bacteriophages

    DEFF Research Database (Denmark)

    Muhammed, Musemma K.; Kot, Witold; Neve, Horst

    2017-01-01

    Despite their huge potential for characterizing the biodiversity of phages, metagenomic studies are currently not available for dairy bacteriophages, partly due to the lack of a standard procedure for phage extraction. We optimized an extraction method that allows to remove the bulk protein from...

  16. Use of whole genome shotgun metagenomics: a practical guide for the microbiome-minded physician scientist.

    Science.gov (United States)

    Ma, Jun; Prince, Amanda; Aagaard, Kjersti M

    2014-01-01

    Whole genome shotgun sequencing (WGS) has been increasingly recognized as the most comprehensive and robust approach for metagenomics research. When compared with 16S-based metagenomics, it offers the advantage of identification of species level taxonomy and the estimation of metabolic pathway activities from human and environmental samples. Several large-scale metagenomic projects have been recently conducted or are currently underway utilizing WGS. With the generation of vast amounts of data, the bioinformatics and computational analysis of WGS results become vital for the success of a metagenomics study. However, each step in the WGS data analysis, including metagenome assembly, gene prediction, taxonomy identification, function annotation, and pathway analysis, is complicated by the sheer amount of data. Algorithms and tools have been developed specifically to handle WGS-generated metagenomics data with the hope of reducing the requirement on computational time and storage space. Here, we present an overview of the current state of metagenomics through WGS sequencing, challenges frequently encountered, and up-to-date solutions. Several applications that are uniquely applicable to microbiome studies in reproductive and perinatal medicine are also discussed. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

  17. FY08 LDRD Final Report Probabilistic Inference of Metabolic Pathways from Metagenomic Sequence Data

    Energy Technology Data Exchange (ETDEWEB)

    D' haeseleer, P

    2009-03-01

    Metagenomic 'shotgun' sequencing of environmental microbial communities has the potential to revolutionize microbial ecology, allowing a cultivation-independent, yet sequence-based analysis of the metabolic capabilities and functions present in an environmental sample. Although its intensive sequencing requirements are a good match for the continuously increasing bandwidth at sequencing centers, the complexity, seemingly inexhaustible novelty, and 'scrambled' nature of metagenomic data is also proving a tremendous challenge for analysis. In fact, many metagenomics projects do not go much further than providing a list of novel gene variants and over- or under-represented functional gene categories. In this project, we proposed to develop a set of novel metagenomic sequence analysis tools, including a binning method to group sequences by species, inference of phenotypes and metabolic pathways from these reconstructed species, and extraction of coarse-grained flux models. We proposed to closely collaborate with the DOE Joint Genome Institute to align these tools with their metagenomics analysis needs and the developing IMG/M metagenomics pipeline. Results would be cross-validated with simulated metagenomic data using a testing platform developed at the JGI.

  18. Temperature and chlorophyll a profile data from bottle samples in the Southern Ocean from the R/V Fuji for the Japanese Antarctic Research Expedition, 1968-1969 (NCEI Accession 0001663)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — These data were key entered from analog manuscript "JARE data Reports. Oceanographic Data of the 10th Japanese Antarctic Research Expedition 1968-1969. Kobe...

  19. Comparative metagenomics demonstrating different degradative capacity of activated biomass treating hydrocarbon contaminated wastewater.

    Science.gov (United States)

    Yadav, Trilok Chandra; Pal, Rajesh Ramavadh; Shastri, Sunita; Jadeja, Niti B; Kapley, Atya

    2015-01-01

    This study demonstrates the diverse degradative capacity of activated biomass, when exposed to different levels of total dissolved solids (TDS) using a comparative metagenomics approach. The biomass was collected at two time points to examine seasonal variations. Four metagenomes were sequenced on Illumina Miseq platform and analysed using MG-RAST. STAMP tool was used to analyse statistically significant differences amongst different attributes of metagenomes. Metabolic pathways related to degradation of aromatics via the central and peripheral pathways were found to be dominant in low TDS metagenome, while pathways corresponding to central carbohydrate metabolism, nitrogen, organic acids were predominant in high TDS sample. Seasonal variation was seen to affect catabolic gene abundance as well as diversity of the microbial community. Degradation of model compounds using activated sludge demonstrated efficient utilisation of single aromatic ring compounds in both samples but cyclic compounds were not efficiently utilised by biomass exposed to high TDS. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Metagenome-based analysis: a promising direction for plankton ecological studies.

    Science.gov (United States)

    Yan, QingYun; Yu, YuHe

    2011-01-01

    The plankton community plays an especially important role in the functioning of aquatic ecosystems and also in biogeochemical cycles. Since the beginning of marine research expeditions in the 1870s, an enormous number of planktonic organisms have been described and studied. Plankton investigation has become one of the most important areas of aquatic ecological study, as well as a crucial component of aquatic environmental evaluation. Nonetheless, traditional investigations have mainly focused on morphospecies composition, abundances and dynamics, which primarily depend on morphological identification and counting under microscopes. However, for many species/groups, with few readily observable characteristics, morphological identification and counting have historically been a difficult task. Over the past decades, microbiologists have endeavored to apply and extend molecular techniques to address questions in microbial ecology. These culture-independent studies have generated new insights into microbial ecology. One such strategy, metagenome-based analysis, has also proved to be a powerful tool for plankton research. This mini-review presents a brief history of plankton research using morphological and metagenome-based approaches and the potential applications and further directions of metagenomic analyses in plankton ecological studies are discussed. The use of metagenome-based approaches for plankton ecological study in aquatic ecosystems is encouraged.

  1. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes.

    Science.gov (United States)

    Meyer, F; Paarmann, D; D'Souza, M; Olson, R; Glass, E M; Kubal, M; Paczian, T; Rodriguez, A; Stevens, R; Wilke, A; Wilkening, J; Edwards, R A

    2008-09-19

    Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers. A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis - the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org.

  2. Metagenome Fragment Classification Using N-Mer Frequency Profiles

    Directory of Open Access Journals (Sweden)

    Gail Rosen

    2008-01-01

    Full Text Available A vast amount of microbial sequencing data is being generated through large-scale projects in ecology, agriculture, and human health. Efficient high-throughput methods are needed to analyze the mass amounts of metagenomic data, all DNA present in an environmental sample. A major obstacle in metagenomics is the inability to obtain accuracy using technology that yields short reads. We construct the unique N-mer frequency profiles of 635 microbial genomes publicly available as of February 2008. These profiles are used to train a naive Bayes classifier (NBC) that can be used to identify the genome of any fragment. We show that our method is comparable to BLAST for small 25 bp fragments but does not have the ambiguity of BLAST's tied top scores. We demonstrate that this approach is scalable to identify any fragment from hundreds of genomes. It also performs quite well at the strain, species, and genera levels and achieves strain resolution despite classifying ubiquitous genomic fragments (gene and nongene regions). Cross-validation analysis demonstrates that species-accuracy achieves 90% for highly-represented species containing an average of 8 strains. We demonstrate that such a tool can be used on the Sargasso Sea dataset, and our analysis shows that NBC can be further enhanced.
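    The idea of training a naive Bayes classifier on per-genome N-mer profiles can be sketched as follows. This is a toy illustration, not the NBC tool described in the abstract: the function names, the add-one smoothing, the uniform prior over genomes and the two made-up reference sequences are all assumptions.

```python
from collections import Counter
import math


def kmer_counts(seq, k):
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train_profiles(genomes, k):
    """Build a log-probability k-mer profile per genome.

    genomes: dict mapping a genome name to its sequence.
    Returns a dict mapping name -> (log-prob per seen k-mer, log-prob of an unseen k-mer),
    using add-one smoothing over the 4**k possible k-mers (an assumed choice).
    """
    profiles = {}
    vocabulary = 4 ** k
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) + vocabulary
        log_probs = {kmer: math.log((c + 1) / total) for kmer, c in counts.items()}
        profiles[name] = (log_probs, math.log(1 / total))
    return profiles


def classify(fragment, profiles, k):
    """Return the genome whose profile gives the fragment the highest log-likelihood.

    A uniform prior over genomes is assumed, so only the likelihood matters.
    """
    best_name, best_score = None, -math.inf
    for name, (log_probs, unseen) in profiles.items():
        score = sum(
            log_probs.get(fragment[i:i + k], unseen)
            for i in range(len(fragment) - k + 1)
        )
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # Hypothetical toy "genomes" with very different composition.
    genomes = {
        "at_rich": "ATATATTATAATATTTAATA" * 5,
        "gc_rich": "GCGCGGCCGCGGGCCGCGCC" * 5,
    }
    profiles = train_profiles(genomes, k=3)
    print(classify("ATATTATA", profiles, k=3))
    print(classify("GCGGCCGC", profiles, k=3))
```

    Because the score is a sum of log-probabilities, ties between genomes are far less likely than with an alignment score, which is one way to read the abstract's remark about BLAST's tied top scores.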

  3. CoMeta: Classification of Metagenomes Using k-mers

    OpenAIRE

    Jolanta Kawulok; Sebastian Deorowicz

    2015-01-01

    Nowadays, the study of environmental samples has been developing rapidly. Characterization of the environment composition broadens the knowledge about the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In the presented paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which ass...

  4. The Characterization of Novel Tissue Microbiota Using an Optimized 16S Metagenomic Sequencing Pipeline

    OpenAIRE

    Lluch, Jérôme; Servant, Florence; Païssé, Sandrine; Valle, Carine; Valière, Sophie; Kuchly, Claire; Vilchez, Gaëlle; Donnadieu, Cécile; Courtney, Michael; Burcelin, Rémy; Amar, Jacques; Bouchez, Olivier; Lelouvier, Benjamin

    2015-01-01

    Background Substantial progress in high-throughput metagenomic sequencing methodologies has enabled the characterisation of bacteria from various origins (for example gut and skin). However, the recently-discovered bacterial microbiota present within animal internal tissues has remained unexplored due to technical difficulties associated with these challenging samples. Results We have optimized a specific 16S rDNA-targeted metagenomics sequencing (16S metabarcoding) pipeline based on the Illu...

  5. Contamination of the Arctic reflected in microbial metagenomes from the Greenland ice sheet

    DEFF Research Database (Denmark)

    Hauptmann, Aviaja Zenia Edna Lyberth; Sicheritz-Pontén, Thomas; Cameron, Karen A.

    2017-01-01

    interact with contamination in the Arctic is limited. Through shotgun metagenomic data and binned genomes from metagenomes we show that microbial communities, sampled from multiple surface ice locations on the Greenland ice sheet, have the potential for resistance to and degradation of contaminants....... The microbial potential to degrade anthropogenic contaminants, such as toxic and persistent polychlorinated biphenyls, was found to be spatially variable and not limited to regions close to human activities. Binned genomes showed close resemblance to microorganisms isolated from contaminated habitats...

  6. Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold

    Science.gov (United States)

    Nijkamp, Jurgen F.; Pop, Mihai; Reinders, Marcel J. T.; de Ridder, Dick

    2013-01-01

    Motivation: Although many tools are available to study variation and its impact in single genomes, there is a lack of algorithms for finding such variation in metagenomes. This hampers the interpretation of metagenomics sequencing datasets, which are increasingly acquired in research on the (human) microbiome, in environmental studies and in the study of processes in the production of foods and beverages. Existing algorithms often depend on the use of reference genomes, which pose a problem when a metagenome of a priori unknown strain composition is studied. In this article, we develop a method to perform reference-free detection and visual exploration of genomic variation, both within a single metagenome and between metagenomes. Results: We present the MaryGold algorithm and its implementation, which efficiently detects bubble structures in contig graphs using graph decomposition. These bubbles represent variable genomic regions in closely related strains in metagenomic samples. The variation found is presented in a condensed Circos-based visualization, which allows for easy exploration and interpretation of the found variation. We validated the algorithm on two simulated datasets containing three and seven Escherichia coli genomes, respectively, and showed that finding allelic variation in these genomes improves assemblies. Additionally, we applied MaryGold to publicly available real metagenomic datasets, enabling us to find within-sample genomic variation in the metagenomes of a kimchi fermentation process, the microbiome of a premature infant and in microbial communities living on acid mine drainage. Moreover, we used MaryGold for between-sample variation detection and exploration by comparing sequencing data sampled at different time points for both of these datasets. Availability: MaryGold has been written in C++ and Python and can be downloaded from http://bioinformatics.tudelft.nl/software Contact: d.deridder@tudelft.nl PMID:24058058

  7. Marine Metagenome as A Resource for Novel Enzymes

    Directory of Open Access Journals (Sweden)

    Amani D. Alma’abadi

    2015-10-01

    Full Text Available More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in the metagenomics as an important part of bioinformatics analysis in big data.

  8. Marine Metagenome as A Resource for Novel Enzymes

    KAUST Repository

    Alma’abadi, Amani D.

    2015-11-10

    More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in the metagenomics as an important part of bioinformatics analysis in big data.

  9. Gene prediction in metagenomic fragments: A large scale machine learning approach

    Directory of Open Access Journals (Sweden)

    Morgenstern Burkhard

    2008-04-01

    Full Text Available Abstract. Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of most fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and their unknown phylogenetic origin require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene

  10. Probabilistic Inference of Biochemical Reactions in Microbial Communities from Metagenomic Sequences

    Science.gov (United States)

    Jiao, Dazhi; Ye, Yuzhen; Tang, Haixu

    2013-01-01

    Shotgun metagenomics has been applied to the studies of the functionality of various microbial communities. As a critical analysis step in these studies, biological pathways are reconstructed based on the genes predicted from metagenomic shotgun sequences. Pathway reconstruction provides insights into the functionality of a microbial community and can be used for comparing multiple microbial communities. The utilization of pathway reconstruction, however, can be jeopardized because of imperfect functional annotation of genes, and ambiguity in the assignment of predicted enzymes to biochemical reactions (e.g., some enzymes are involved in multiple biochemical reactions). Considering that metabolic functions in a microbial community are carried out by many enzymes in a collaborative manner, we present a probabilistic sampling approach to profiling functional content in a metagenomic dataset, by sampling functions of catalytically promiscuous enzymes within the context of the entire metabolic network defined by the annotated metagenome. We test our approach on metagenomic datasets from environmental and human-associated microbial communities. The results show that our approach provides a more accurate representation of the metabolic activities encoded in a metagenome, and thus improves the comparative analysis of multiple microbial communities. In addition, our approach reports likelihood scores of putative reactions, which can be used to identify important reactions and metabolic pathways that reflect the environmental adaptation of the microbial communities. Source code for sampling metabolic networks is available online at http://omics.informatics.indiana.edu/mg/MetaNetSam/. PMID:23555216

  11. Diversity Indices as Measures of Functional Annotation Methods in Metagenomics Studies

    KAUST Repository

    Jankovic, Boris R.

    2016-01-26

    Applications of high-throughput techniques in metagenomics studies produce massive amounts of data. Fragments of genomic, transcriptomic and proteomic molecules are all found in metagenomics samples. Laborious and meticulous effort in sequencing and functional annotation are then required to, amongst other objectives, reconstruct a taxonomic map of the environment that metagenomics samples were taken from. In addition to computational challenges faced by metagenomics studies, the analysis is further complicated by the presence of contaminants in the samples, potentially resulting in skewed taxonomic analysis. The functional annotation in metagenomics can utilize all available omics data and therefore different methods that are associated with a particular type of data. For example, protein-coding DNA, non-coding RNA or ribosomal RNA data can be used in such an analysis. These methods would have their advantages and disadvantages and the question of comparison among them naturally arises. There are several criteria that can be used when performing such a comparison. Loosely speaking, methods can be evaluated in terms of computational complexity or in terms of the expected biological accuracy. We propose that the concept of diversity that is used in the ecosystems and species diversity studies can be successfully used in evaluating certain aspects of the methods employed in metagenomics studies. We show that when applying the concept of Hill’s diversity, the analysis of variations in the diversity order provides valuable clues into the robustness of methods used in the taxonomical analysis.
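    The Hill diversity mentioned in this abstract has a standard closed form: for relative abundances p_i and order q, the Hill number is (sum of p_i^q)^(1/(1-q)), with the limit q → 1 equal to the exponential of the Shannon entropy. The sketch below computes it across several orders; the function name and the example abundance tables are hypothetical, and this is not the evaluation procedure used by the author.

```python
import math


def hill_number(abundances, q):
    """Hill diversity of order q for a list of non-negative abundances.

    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index.
    """
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit case: exp of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    even = [10, 10, 10, 10]    # hypothetical taxon counts from one annotation method
    skewed = [97, 1, 1, 1]     # hypothetical counts from another method
    for q in (0, 0.5, 1, 2, 3):
        print(q, round(hill_number(even, q), 3), round(hill_number(skewed, q), 3))
```

    For a perfectly even community the value stays at the number of taxa for every q, while for a skewed one it falls as q grows; comparing these profiles across annotation methods is one way to examine variation in the diversity order.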

  12. A lunar polar expedition.

    Science.gov (United States)

    Dowling, Richard; Staehle, Robert L.; Svitek, Thomas

    This paper reviews issues related to a five-person expedition to the lunar north pole which primarily addresses site selection and the requirements for transportation, power, and life support. A one-year stay on the lunar surface is proposed based on available technology, and proposals are detailed for incorporating flight-proven systems, abort or rescue options, and the use of the base as the nucleus for subsequent operations. Specific details are given regarding lunar orbital data, the characteristics of the proposed base, power and consumables requirements, and equipment such as two-person lunar roving vehicles and space suits. During the expedition: (1) water is recycled; (2) Autolanders are used to deliver equipment; (3) two rovers are included in the mass budget; (4) the lunar surface is studied in detail. A polar lunar-base site offers the advantages of unobstructed astronomy, enhanced heat rejection, and the potential for reuse.

  13. An application of statistics to comparative metagenomics

    Directory of Open Access Journals (Sweden)

    Rohwer Forest

    2006-03-01

    Full Text Available Abstract. Background: Metagenomics, sequence analyses of genomic DNA isolated directly from the environment, can be used to identify organisms and model community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. Results: Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenomes when compared to non-redundant databases were identified. Conclusion: The methodology described herein applies statistics to the comparisons of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about physiology and metabolism of microbes from these ecosystems.

  14. Potential applications of metagenomics to assess the biological effects of food structure and function.

    Science.gov (United States)

    Santiago-Rodriguez, Tasha M; Cano, Raul; Jiménez-Flores, Rafael

    2016-10-12

    Metagenomics, or the collective study of genomes, is an important emerging area in microbiology and related fields, and is increasingly being recognized as a tool to characterize the microbial community structure and function of diverse sample types. Metagenomics compares sequences to existing databases to enable the identification of potential microbial reservoirs and predict specific functions; yet, metagenomics has not been widely applied to understand how changes in food structure and composition affect microbial communities and their function in the human gut. Studies are needed to understand the digestion of food products, and to measure their effectiveness in preserving a healthy microbiome, as well as intestinal function. We suggest the use of metagenomics with validation techniques such as Polymerase Chain Reaction (PCR), cloning and functional assays to assess the biological effects of food structure and function.

  15. Diagnosis of Bacterial Bloodstream Infections: A 16S Metagenomics Approach.

    Science.gov (United States)

    Decuypere, Saskia; Meehan, Conor J; Van Puyvelde, Sandra; De Block, Tessa; Maltha, Jessica; Palpouguini, Lompo; Tahita, Marc; Tinto, Halidou; Jacobs, Jan; Deborggraeve, Stijn

    2016-02-01

    Bacterial bloodstream infection (bBSI) is one of the leading causes of death in critically ill patients and accurate diagnosis is therefore crucial. We here report a 16S metagenomics approach for diagnosing and understanding bBSI. The proof-of-concept was delivered in 75 children (median age 15 months) with severe febrile illness in Burkina Faso. Standard blood culture and malaria testing were conducted at the time of hospital admission. 16S metagenomics testing was done retrospectively and in duplicate on the blood of all patients. Total DNA was extracted from the blood and the V3-V4 regions of the bacterial 16S rRNA genes were amplified by PCR and deep sequenced on an Illumina MiSeq sequencer. Paired reads were curated, taxonomically labeled, and filtered. Blood culture diagnosed bBSI in 12 patients, but this number increased to 22 patients when combining blood culture and 16S metagenomics results. In addition to superior sensitivity compared to standard blood culture, 16S metagenomics revealed important novel insights into the nature of bBSI. Patients with acute malaria or recovering from malaria had a 7-fold higher risk of presenting polymicrobial bloodstream infections compared to patients with no recent malaria diagnosis (p-value = 0.046). Malaria is known to affect epithelial gut function and may thus facilitate bacterial translocation from the intestinal lumen to the blood. Importantly, patients with such polymicrobial blood infections showed a 9-fold higher risk factor for not surviving their febrile illness (p-value = 0.030). Our data demonstrate that 16S metagenomics is a powerful approach for the diagnosis and understanding of bBSI. This proof-of-concept study also showed that appropriate control samples are crucial to detect background signals due to environmental contamination.

  16. Diagnosis of Bacterial Bloodstream Infections: A 16S Metagenomics Approach.

    Directory of Open Access Journals (Sweden)

    Saskia Decuypere

    2016-02-01

    Full Text Available Bacterial bloodstream infection (bBSI) is one of the leading causes of death in critically ill patients and accurate diagnosis is therefore crucial. We here report a 16S metagenomics approach for diagnosing and understanding bBSI. The proof-of-concept was delivered in 75 children (median age 15 months) with severe febrile illness in Burkina Faso. Standard blood culture and malaria testing were conducted at the time of hospital admission. 16S metagenomics testing was done retrospectively and in duplicate on the blood of all patients. Total DNA was extracted from the blood and the V3-V4 regions of the bacterial 16S rRNA genes were amplified by PCR and deep sequenced on an Illumina MiSeq sequencer. Paired reads were curated, taxonomically labeled, and filtered. Blood culture diagnosed bBSI in 12 patients, but this number increased to 22 patients when combining blood culture and 16S metagenomics results. In addition to superior sensitivity compared to standard blood culture, 16S metagenomics revealed important novel insights into the nature of bBSI. Patients with acute malaria or recovering from malaria had a 7-fold higher risk of presenting polymicrobial bloodstream infections compared to patients with no recent malaria diagnosis (p-value = 0.046). Malaria is known to affect epithelial gut function and may thus facilitate bacterial translocation from the intestinal lumen to the blood. Importantly, patients with such polymicrobial blood infections showed a 9-fold higher risk factor for not surviving their febrile illness (p-value = 0.030). Our data demonstrate that 16S metagenomics is a powerful approach for the diagnosis and understanding of bBSI. This proof-of-concept study also showed that appropriate control samples are crucial to detect background signals due to environmental contamination.

  17. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer

    DEFF Research Database (Denmark)

    Yu, Jun; Feng, Qiang; Wong, Sunny Hei

    2017-01-01

    OBJECTIVE: To evaluate the potential for diagnosing colorectal cancer (CRC) from faecal metagenomes. DESIGN: We performed metagenome-wide association studies on faecal samples from 74 patients with CRC and 54 controls from China, and validated the results in 16 patients and 24 controls from Denmark...... markers in the Danish cohort. In the French and Austrian cohorts, these four genes distinguished CRC metagenomes from controls with areas under the receiver-operating curve (AUC) of 0.72 and 0.77, respectively. qPCR measurements of two of these genes accurately classified patients with CRC...... in the independent Chinese cohort with AUC=0.84 and OR of 23. These genes were enriched in early-stage (I-II) patient microbiomes, highlighting the potential for using faecal metagenomic biomarkers for early diagnosis of CRC. CONCLUSIONS: We present the first metagenomic profiling study of CRC faecal microbiomes...

  18. Metagenomics and the protein universe

    Science.gov (United States)

    Godzik, Adam

    2011-01-01

    Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments is due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084

  19. Functional metagenomics of extreme environments.

    Science.gov (United States)

    Mirete, Salvador; Morgante, Verónica; González-Pastor, José Eduardo

    2016-04-01

    The bioprospecting of enzymes that operate under extreme conditions is of particular interest for many biotechnological and industrial processes. Nevertheless, the retrieval of novel enzymes is considerably limited, as only a small fraction of microorganisms derived from extreme environments can be cultured under standard laboratory conditions. Functional metagenomics has the advantage of requiring neither the cultivation of microorganisms nor prior sequence information about known genes, thus representing a valuable approach for mining enzymes with new features. In this review, we summarize studies showing how functional metagenomics was employed to retrieve genes encoding proteins involved not only in molecular adaptation and resistance to extreme environmental conditions but also in other enzymatic activities of biotechnological interest. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Biological results of the Snellius expedition. XXIV. Pelagic Tunicates of the Snellius expedition

    NARCIS (Netherlands)

    Tokioka, T.

    1974-01-01

    Eleven samples of pelagic tunicates were found in the material collected during the Snellius Expedition 1929-30. In these, seven species, viz., two pyrosomas and five salpas, are included. In addition, a few old specimens of another species of Pyrosoma were found in the collection of the Leiden

  1. Consistent metagenomic biomarker detection via robust PCA.

    Science.gov (United States)

    Alshawaqfeh, Mustafa; Bashaireh, Ahmad; Serpedin, Erchin; Suchodolski, Jan

    2017-01-31

    Recent developments of high throughput sequencing technologies allow the characterization of the microbial communities inhabiting our world. Various metagenomic studies have suggested using microbial taxa as potential biomarkers for certain diseases. In practice, the number of available samples varies from experiment to experiment. Therefore, a robust biomarker detection algorithm is needed to provide a set of potential markers irrespective of the number of available samples. Consistent performance is essential to derive solid biological conclusions and to transfer these findings into clinical applications. Surprisingly, the consistency of a metagenomic biomarker detection algorithm with respect to the variation in the experiment size has not been addressed by the current state-of-the-art algorithms. We propose a consistency-classification framework that enables the assessment of consistency and classification performance of a biomarker discovery algorithm. This evaluation protocol is based on random resampling to mimic the variation in the experiment size. Moreover, we model the metagenomic data matrix as a superposition of two matrices. The first matrix is a low-rank matrix that models the abundance levels of the irrelevant bacteria. The second matrix is a sparse matrix that captures the abundance levels of the bacteria that are differentially abundant between different phenotypes. Then, we propose a novel Robust Principal Component Analysis (RPCA) based biomarker discovery algorithm to recover the sparse matrix. RPCA belongs to the class of multivariate feature selection methods which treat the features collectively rather than individually. This provides the proposed algorithm with an inherent ability to handle the complex microbial interactions. Comprehensive comparisons of RPCA with the state-of-the-art algorithms on two realistic datasets are conducted. Results show that RPCA consistently outperforms the other algorithms in terms of classification accuracy and
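
    The low-rank plus sparse model described in this abstract can be illustrated with a minimal sketch. The abstract does not say which RPCA solver the authors use; the code below assumes the common principal component pursuit formulation solved by ADMM, with the usual default weights. The function names, the toy abundance matrix, and the column scoring at the end are illustrative assumptions, not the published method.

```python
import numpy as np


def shrink(x, tau):
    """Elementwise soft-thresholding (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_shrink(x, tau):
    """Soft-threshold the singular values (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt


def rpca(m, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split m into low_rank + sparse by principal component pursuit (ADMM)."""
    m = np.asarray(m, dtype=float)
    n1, n2 = m.shape
    total = np.abs(m).sum()
    if total == 0.0:
        # Nothing to decompose: both parts are zero.
        return np.zeros_like(m), np.zeros_like(m)
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))   # common default weight on the sparse term
    if mu is None:
        mu = n1 * n2 / (4.0 * total)       # common default penalty parameter
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    for _ in range(max_iter):
        low_rank = svd_shrink(m - sparse + dual / mu, 1.0 / mu)
        sparse = shrink(m - low_rank + dual / mu, lam / mu)
        residual = m - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low_rank, sparse


# Toy data (hypothetical): 40 samples x 30 taxa, a rank-2 background plus
# one taxon (column 3) that is elevated in the first 20 samples.
rng = np.random.default_rng(0)
background = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 30))
spikes = np.zeros((40, 30))
spikes[:20, 3] = 8.0
abundance = background + spikes

low_rank, sparse = rpca(abundance)
# One simple way to rank candidate taxa: total magnitude per column of the sparse part.
scores = np.abs(sparse).sum(axis=0)
```

    In this sketch the rows are samples and the columns are taxa, so columns of the sparse part with large scores would be the candidate markers. How the published algorithm selects markers from the sparse matrix is not stated in the abstract.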

  2. Bioprospecting for β-lactam resistance genes using a metagenomics-guided strategy.

    Science.gov (United States)

    Yang, Chao; Yang, Ying; Che, You; Xia, Yu; Li, Liguan; Xiong, Wenguang; Zhang, Tong

    2017-08-01

    Emergence of new antibiotic-resistant bacteria poses a serious threat to human health, which is largely attributed to the evolution and spread of antibiotic resistance genes (ARGs). In this work, a metagenomics-guided strategy consisting of metagenomic analysis and function validation was proposed for rapidly identifying novel ARGs from hot spots of ARG dissemination, such as wastewater treatment plants (WWTPs) and animal feces. We used an antibiotic resistance gene database to annotate 76 putative β-lactam resistance genes from the metagenomes of sludge and chicken feces. Among these 76 candidate genes, 25 target genes that shared 40-70% amino acid identity to known β-lactamases were cloned by PCR from the metagenomes. Their resistances to four β-lactam antibiotics were further demonstrated. Furthermore, the validated ARGs were used as the reference sequences to identify novel ARGs in eight environmental samples, suggesting the necessity of re-examining the profiles of ARGs in environmental samples using the validated novel ARG sequences. This metagenomics-guided pipeline does not rely on the activity of ARGs during the initial screening process and may specifically select novel ARG sequences for function validation, which makes it suitable for the high-throughput screening of novel ARGs from environmental metagenomes.

  3. PhyloSift: phylogenetic analysis of genomes and metagenomes

    Directory of Open Access Journals (Sweden)

    Aaron E. Darling

    2014-01-01

    Full Text Available Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).

  4. Windshield splatter analysis with the Galaxy metagenomic pipeline.

    Science.gov (United States)

    Kosakovsky Pond, Sergei; Wadhawan, Samir; Chiaromonte, Francesca; Ananda, Guruprasad; Chung, Wen-Yu; Taylor, James; Nekrutenko, Anton

    2009-11-01

    How many species inhabit our immediate surroundings? A straightforward collection technique suitable for answering this question is known to anyone who has ever driven a car at highway speeds. The windshield of a moving vehicle is subjected to numerous insect strikes and can be used as a collection device for representative sampling. Unfortunately the analysis of biological material collected in that manner, as with most metagenomic studies, proves to be rather demanding due to the large number of required tools and considerable computational infrastructure. In this study, we use organic matter collected by a moving vehicle to design and test a comprehensive pipeline for phylogenetic profiling of metagenomic samples that includes all steps from processing and quality control of data generated by next-generation sequencing technologies to statistical analyses and data visualization. To the best of our knowledge, this is also the first publication that features a live online supplement providing access to exact analyses and workflows used in the article.

  5. Integrative workflows for metagenomic analysis.

    Science.gov (United States)

    Ladoukakis, Efthymios; Kolisis, Fragiskos N; Chatziioannou, Aristotelis A

    2014-01-01

    The rapid evolution of all sequencing technologies, described by the term Next Generation Sequencing (NGS), have revolutionized metagenomic analysis. They constitute a combination of high-throughput analytical protocols, coupled to delicate measuring techniques, in order to potentially discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering the management or even the storage, critical bottlenecks with respect to the overall analytical endeavor. The enormous complexity is even more aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential, for each analytical task, in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, nonetheless non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requesting a high level of expertise for their proper application or the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale, requires grand computational resources, imposing as the sole realistic solution, the utilization of cloud computing infrastructures. In this review article we discuss different, integrative, bioinformatic solutions available, which address the aforementioned issues, by performing a critical assessment of the available automated pipelines for data management, quality control, and annotation of metagenomic data, embracing various, major sequencing technologies and applications.

  6. Integrative Workflows for Metagenomic Analysis

    Directory of Open Access Journals (Sweden)

    Efthymios Ladoukakis

    2014-11-01

    Full Text Available The rapid evolution of all sequencing technologies, described by the term Next Generation Sequencing (NGS), have revolutionized metagenomic analysis. They constitute a combination of high-throughput analytical protocols, coupled to delicate measuring techniques, in order to potentially discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e. Sanger). From a bioinformatic perspective, this boils down to many gigabytes of data being generated from each single sequencing experiment, rendering the management or even the storage, critical bottlenecks with respect to the overall analytical endeavor. The enormous complexity is even more aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential, for each analytical task, in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, nonetheless non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requesting a high level of expertise for their proper application or the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale, requires grand computational resources, imposing as the sole realistic solution, the utilization of cloud computing infrastructures. In this review article we discuss different, integrative, bioinformatic solutions available, which address the aforementioned issues, by performing a critical assessment of the available automated pipelines for data management, quality control and annotation of metagenomic data, embracing various, major sequencing technologies and applications.

  7. Subtractive assembly for comparative metagenomics, and its application to type 2 diabetes metagenomes.

    Science.gov (United States)

    Wang, Mingjie; Doak, Thomas G; Ye, Yuzhen

    2015-11-02

    Comparative metagenomics remains challenging due to the size and complexity of metagenomic datasets. Here we introduce subtractive assembly, a de novo assembly approach for comparative metagenomics that directly assembles only the differential reads that distinguish between two groups of metagenomes. Using simulated datasets, we show it improves both the efficiency of the assembly and the assembly quality of the differential genomes and genes. Further, its application to type 2 diabetes (T2D) metagenomic datasets reveals clear signatures of the T2D gut microbiome, revealing new phylogenetic and functional features of the gut microbial communities associated with T2D.
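
    The idea of assembling only the reads that distinguish one group of metagenomes from another can be sketched as a k-mer filter. The abstract does not give the authors' selection criteria, so the presence/absence rule, the parameter names `k` and `min_fraction`, and the toy reads below are all assumptions for illustration; a real pipeline would likely use k-mer abundances and then hand the kept reads to an assembler.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of a sequence as a list."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_set(reads, k):
    """Collect the set of k-mers observed in a collection of reads."""
    seen = set()
    for read in reads:
        seen.update(kmers(read, k))
    return seen


def differential_reads(group_a, group_b, k=5, min_fraction=0.5):
    """Keep reads from group_a whose k-mers are mostly absent from group_b.

    A read is kept when at least min_fraction of its k-mers never occur
    in group_b. Reads shorter than k are skipped.
    """
    b_kmers = kmer_set(group_b, k)
    kept = []
    for read in group_a:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue
        absent = sum(1 for km in read_kmers if km not in b_kmers)
        if absent / len(read_kmers) >= min_fraction:
            kept.append(read)
    return kept


# Toy reads (hypothetical): the first read of group A also occurs in group B.
group_a = ["ACGTACGTAC", "TTTTGGGGCC"]
group_b = ["ACGTACGTAC", "ACGTACGTTT"]
selected = differential_reads(group_a, group_b, k=5)
print(selected)  # ['TTTTGGGGCC']
```

    Only the reads in `selected` would go on to de novo assembly, which is where the efficiency gain claimed in the abstract would come from.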

  8. Expanding the marine virosphere using metagenomics.

    Science.gov (United States)

    Mizuno, Carolina Megumi; Rodriguez-Valera, Francisco; Kimes, Nikole E; Ghai, Rohit

    2013-01-01

    Viruses infecting prokaryotic cells (phages) are the most abundant entities of the biosphere and contain a largely uncharted wealth of genomic diversity. They play a critical role in the biology of their hosts and in ecosystem functioning at large. The classical approaches studying phages require isolation from a pure culture of the host. Direct sequencing approaches have been hampered by the small amounts of phage DNA present in most natural habitats and the difficulty in applying meta-omic approaches, such as annotation of small reads and assembly. Serendipitously, it has been discovered that cellular metagenomes of highly productive ocean waters (the deep chlorophyll maximum) contain significant amounts of viral DNA derived from cells undergoing the lytic cycle. We have taken advantage of this phenomenon to retrieve metagenomic fosmids containing viral DNA from a Mediterranean deep chlorophyll maximum sample. This method allowed description of complete genomes of 208 new marine phages. The diversity of these genomes was remarkable, contributing 21 genomic groups of tailed bacteriophages of which 10 are completely new. Sequence based methods have allowed host assignment to many of them. These predicted hosts represent a wide variety of important marine prokaryotic microbes like members of SAR11 and SAR116 clades, Cyanobacteria and also the newly described low GC Actinobacteria. A metavirome constructed from the same habitat showed that many of the new phage genomes were abundantly represented. Furthermore, other available metaviromes also indicated that some of the new phages are globally distributed in low to medium latitude ocean waters. The availability of many genomes from the same sample allows a direct approach to viral population genomics confirming the remarkable mosaicism of phage genomes.

  9. Expanding the marine virosphere using metagenomics.

    Directory of Open Access Journals (Sweden)

    Carolina Megumi Mizuno

    Full Text Available Viruses infecting prokaryotic cells (phages) are the most abundant entities of the biosphere and contain a largely uncharted wealth of genomic diversity. They play a critical role in the biology of their hosts and in ecosystem functioning at large. The classical approaches studying phages require isolation from a pure culture of the host. Direct sequencing approaches have been hampered by the small amounts of phage DNA present in most natural habitats and the difficulty in applying meta-omic approaches, such as annotation of small reads and assembly. Serendipitously, it has been discovered that cellular metagenomes of highly productive ocean waters (the deep chlorophyll maximum) contain significant amounts of viral DNA derived from cells undergoing the lytic cycle. We have taken advantage of this phenomenon to retrieve metagenomic fosmids containing viral DNA from a Mediterranean deep chlorophyll maximum sample. This method allowed description of complete genomes of 208 new marine phages. The diversity of these genomes was remarkable, contributing 21 genomic groups of tailed bacteriophages of which 10 are completely new. Sequence based methods have allowed host assignment to many of them. These predicted hosts represent a wide variety of important marine prokaryotic microbes like members of SAR11 and SAR116 clades, Cyanobacteria and also the newly described low GC Actinobacteria. A metavirome constructed from the same habitat showed that many of the new phage genomes were abundantly represented. Furthermore, other available metaviromes also indicated that some of the new phages are globally distributed in low to medium latitude ocean waters. The availability of many genomes from the same sample allows a direct approach to viral population genomics confirming the remarkable mosaicism of phage genomes.

  10. Windshield splatter analysis with the Galaxy metagenomic pipeline

    OpenAIRE

    Kosakovsky Pond, Sergei; Wadhawan, Samir; Chiaromonte, Francesca; Ananda, Guruprasad; Chung, Wen-Yu; Taylor, James; Nekrutenko, Anton

    2009-01-01

    How many species inhabit our immediate surroundings? A straightforward collection technique suitable for answering this question is known to anyone who has ever driven a car at highway speeds. The windshield of a moving vehicle is subjected to numerous insect strikes and can be used as a collection device for representative sampling. Unfortunately the analysis of biological material collected in that manner, as with most metagenomic studies, proves to be rather demanding due to the large numb...

  11. Deconvoluting simulated metagenomes: the performance of hard- and soft- clustering algorithms applied to metagenomic chromosome conformation capture (3C)

    Directory of Open Access Journals (Sweden)

    Matthew Z. DeMaere

    2016-11-01

    Full Text Available Background: Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods: We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results: When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion: Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios; however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development.
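
    For readers unfamiliar with the validation measure named in this abstract, the following is a minimal sketch of the basic B-cubed precision and recall for hard clusterings only. It is not the extended B-cubed measure for overlapping (soft) clusters, nor the authors' adaptation of it, neither of which is specified in the abstract. The contig and genome names are hypothetical.

```python
from collections import Counter


def bcubed(truth, predicted):
    """Basic B-cubed precision, recall and F1 for hard clusterings.

    truth and predicted map each item (e.g. a contig) to a single label:
    its true genome and its assigned cluster, respectively.
    """
    items = list(truth)
    n = len(items)
    # Number of items sharing each (true class, predicted cluster) combination.
    joint = Counter((truth[i], predicted[i]) for i in items)
    class_size = Counter(truth[i] for i in items)
    cluster_size = Counter(predicted[i] for i in items)
    # Per-item precision: share of the item's cluster that has the item's class.
    precision = sum(
        joint[(truth[i], predicted[i])] / cluster_size[predicted[i]] for i in items
    ) / n
    # Per-item recall: share of the item's class that landed in the item's cluster.
    recall = sum(
        joint[(truth[i], predicted[i])] / class_size[truth[i]] for i in items
    ) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (hypothetical): five contigs from two genomes, one contig misplaced.
truth = {"c1": "genomeA", "c2": "genomeA", "c3": "genomeA",
         "c4": "genomeB", "c5": "genomeB"}
predicted = {"c1": "bin1", "c2": "bin1", "c3": "bin2",
             "c4": "bin2", "c5": "bin2"}
precision, recall, f1 = bcubed(truth, predicted)
```

    Because the score is averaged over items rather than over clusters, a single large mixed bin is penalised in proportion to how many contigs it mixes, which is one reason this family of measures suits genome binning.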

  12. A Delphi Technology Foresight Study: Mapping Social Construction of Scientific Evidence on Metagenomics Tests for Water Safety.

    Science.gov (United States)

    Birko, Stanislav; Dove, Edward S; Özdemir, Vural

    2015-01-01

    Access to clean water is a grand challenge in the 21st century. Water safety testing for pathogens currently depends on surrogate measures such as fecal indicator bacteria (e.g., E. coli). Metagenomics concerns high-throughput, culture-independent, unbiased shotgun sequencing of DNA from environmental samples that might transform water safety by detecting waterborne pathogens directly instead of their surrogates. Yet emerging innovations such as metagenomics are often fiercely contested. Innovations are subject to shaping/construction not only by technology but also social systems/values in which they are embedded, such as experts' attitudes towards new scientific evidence. We conducted a classic three-round Delphi survey, comprised of 107 questions. A multidisciplinary expert panel (n = 24) representing the continuum of discovery scientists and policymakers evaluated the emergence of metagenomics tests. To the best of our knowledge, we report here the first Delphi foresight study of experts' attitudes on (1) the top 10 priority evidentiary criteria for adoption of metagenomics tests for water safety, (2) the specific issues critical to governance of metagenomics innovation trajectory where there is consensus or dissensus among experts, (3) the anticipated time lapse from discovery to practice of metagenomics tests, and (4) the role and timing of public engagement in development of metagenomics tests. The ability of a test to distinguish between harmful and benign waterborne organisms, analytical/clinical sensitivity, and reproducibility were the top three evidentiary criteria for adoption of metagenomics. Experts agree that metagenomic testing will provide novel information but there is dissensus on whether metagenomics will replace the current water safety testing methods or impact the public health end points (e.g., reduction in boil water advisories). Interestingly, experts view the publics relevant in a "downstream capacity" for adoption of metagenomics rather

  13. Functional Metagenomics to Study Antibiotic Resistance.

    Science.gov (United States)

    Boolchandani, Manish; Patel, Sanket; Dantas, Gautam

    2017-01-01

    The construction and screening of metagenomic expression libraries has great potential to identify novel genes and their functions. Here, we describe metagenomic library preparation from fecal DNA, screening of libraries for antibiotic resistance genes (ARGs), massively parallel DNA sequencing of the enriched DNA fragments, and a computational pipeline for high-throughput assembly and annotation of functionally selected DNA.

  14. Metagenomic data analysis : computational methods and applications

    NARCIS (Netherlands)

    Gori, F.

    2013-01-01

    Metagenomics is the study of the genomic content of microbial communities, acquired through DNA sequencing technology. The main advantage of metagenomics is that it can overcome the limitations of individual genome sequencing, that can work only on the few culturable microbes. Unfortunately, the

  15. An integrated metagenome and -proteome analysis of the microbial community residing in a biogas production plant.

    Science.gov (United States)

    Ortseifen, Vera; Stolze, Yvonne; Maus, Irena; Sczyrba, Alexander; Bremges, Andreas; Albaum, Stefan P; Jaenicke, Sebastian; Fracowiak, Jochen; Pühler, Alfred; Schlüter, Andreas

    2016-08-10

    To study the metaproteome of a biogas-producing microbial community, fermentation samples were taken from an agricultural biogas plant for microbial cell and protein extraction and corresponding metagenome analyses. Based on metagenome sequence data, taxonomic community profiling was performed to elucidate the composition of bacterial and archaeal sub-communities. The community's cytosolic metaproteome was represented in a 2D-PAGE approach. Metaproteome databases for protein identification were compiled based on the assembled metagenome sequence dataset for the biogas plant analyzed and non-corresponding biogas metagenomes. Protein identification results revealed that the corresponding biogas protein database facilitated the highest identification rate followed by other biogas-specific databases, whereas common public databases yielded insufficient identification rates. Proteins of the biogas microbiome identified as highly abundant were assigned to the pathways involved in methanogenesis, transport and carbon metabolism. Moreover, the integrated metagenome/-proteome approach enabled the examination of genetic-context information for genes encoding identified proteins by studying neighboring genes on the corresponding contig. Exemplarily, this approach led to the identification of a Methanoculleus sp. contig encoding 16 methanogenesis-related gene products, three of which were also detected as abundant proteins within the community's metaproteome. Thus, metagenome contigs provide additional information on the genetic environment of identified abundant proteins. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

    Science.gov (United States)

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2014-01-01

    A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA. Copyright © 2014 Elsevier Inc. All rights reserved.
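    The two-stage workflow described in this abstract (cluster the reads, assemble each cluster independently, then reassemble the resulting contigs together with the leftover reads) can be illustrated with a minimal sketch. This is not the MetaCAA implementation: the GC-content clustering key (`gc_bucket`) and the greedy overlap merger (`greedy_assemble`, standing in for CAP3) are hypothetical simplifications chosen only to make the control flow runnable.

```python
from collections import defaultdict


def gc_bucket(read, n_buckets=4):
    """Hypothetical clustering key: bucket a read by its GC fraction."""
    gc = sum(base in "GC" for base in read) / len(read)
    return min(int(gc * n_buckets), n_buckets - 1)


def cluster_reads(reads):
    """Stage 0: group reads into smaller clusters (assumed criterion: GC bucket)."""
    clusters = defaultdict(list)
    for read in reads:
        clusters[gc_bucket(read)].append(read)
    return list(clusters.values())


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(seqs, min_len=4):
    """Toy stand-in for a single-genome assembler such as CAP3.

    Repeatedly merges the pair with the longest suffix/prefix overlap.
    Sequences that never merge are returned unchanged, so the output
    holds both contigs and unassembled reads.
    """
    seqs = list(dict.fromkeys(seqs))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return seqs
        merged = seqs[best_i] + seqs[best_j][best_n:]
        seqs = [s for k, s in enumerate(seqs) if k not in (best_i, best_j)]
        seqs.append(merged)


def two_stage_assembly(reads, min_len=4):
    """Assemble each cluster on its own, then assemble the pooled results."""
    stage1 = []
    for cluster in cluster_reads(reads):
        stage1.extend(greedy_assemble(cluster, min_len))
    return greedy_assemble(stage1, min_len)


reads = ["AAAATTTT", "TTTTGGGG", "GGGGCCCC"]
print(two_stage_assembly(reads))  # ['AAAATTTTGGGGCCCC']
```

    In this toy input each read falls into a different GC bucket, so nothing merges in the per-cluster round and the second round joins all three reads. A real clustering step would aim to keep reads from the same genome together, so that most merging happens within clusters.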

  17. A statistical toolbox for metagenomics: assessing functional diversity in microbial communities

    Directory of Open Access Journals (Sweden)

    Handelsman Jo

    2008-01-01

    Full Text Available Abstract Background: The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data. Results: Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments. Conclusion: The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.
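    The notions of richness, membership overlap, and a shared "core" of protein families mentioned in this abstract can be made concrete with a small sketch. The functions below are generic set-based measures, not the authors' toolbox, and the family labels (`fam1` ... `fam6`) are invented placeholders rather than data from the study.

```python
def richness(families):
    """Number of distinct operational protein families observed."""
    return len(set(families))


def jaccard(a, b):
    """Membership overlap between two communities (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def core_families(communities):
    """Families present in every community."""
    return set.intersection(*(set(f) for f in communities.values()))


# Hypothetical toy data: labels are placeholders, not results from the paper.
communities = {
    "acid_mine_drainage": ["fam1", "fam2", "fam3"],
    "soil": ["fam1", "fam3", "fam4", "fam5"],
    "whale_fall": ["fam1", "fam5", "fam6"],
}

print(richness(communities["soil"]))                                    # 4
print(jaccard(communities["acid_mine_drainage"], communities["soil"]))  # 0.4
print(core_families(communities))                                       # {'fam1'}
```

    Low pairwise overlap combined with a non-empty core is the pattern the abstract describes as surprising; a statistical treatment would additionally need to account for sampling depth, which this sketch ignores.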

  18. Metagenomic applications in environmental monitoring and bioremediation.

    Science.gov (United States)

    Techtmann, Stephen M; Hazen, Terry C

    2016-10-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  19. Further steps in TANGO: improved taxonomic assignment in metagenomics.

    Science.gov (United States)

    Alonso-Alemany, Daniel; Barré, Aurélien; Beretta, Stefano; Bonizzoni, Paola; Nikolski, Macha; Valiente, Gabriel

    2014-01-01

    TANGO is one of the most accurate tools for the taxonomic assignment of sequence reads. However, because of the differences in the taxonomy structures, performing a taxonomic assignment on different reference taxonomies will produce divergent results. We have improved the TANGO pipeline to be able to perform the taxonomic assignment of a metagenomic sample using alternative reference taxonomies, coming from different sources. We highlight the novel pre-processing step, necessary to accomplish this task, and describe the improvements in the assignment process. We present the new TANGO pipeline in detail, and, finally, we show its performance on four real metagenomic datasets and also on synthetic datasets. The new version of TANGO, including implementation improvements and novel developments to perform the assignment on different reference taxonomies, is freely available at http://sourceforge.net/projects/taxoassignment/.

  20. Explaining diversity in metagenomic datasets by phylogenetic-based feature weighting.

    Science.gov (United States)

    Albanese, Davide; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-03-01

    Metagenomics is revolutionizing our understanding of microbial communities, showing that their structure and composition have profound effects on the ecosystem and in a variety of health and disease conditions. Despite the flourishing of new analysis methods, current approaches based on statistical comparisons between high-level taxonomic classes often fail to identify the microbial taxa that are differentially distributed between sets of samples, since in many cases the taxonomic schema do not allow an adequate description of the structure of the microbiota. This constitutes a severe limitation to the use of metagenomic data in therapeutic and diagnostic applications. To provide a more robust statistical framework, we introduce a class of feature-weighting algorithms that discriminate the taxa responsible for the classification of metagenomic samples. The method unambiguously groups the relevant taxa into clades without relying on pre-defined taxonomic categories, thus including in the analysis also those sequences for which a taxonomic classification is difficult. The phylogenetic clades are weighted and ranked according to their abundance measuring their contribution to the differentiation of the classes of samples, and a criterion is provided to define a reduced set of most relevant clades. Applying the method to public datasets, we show that the data-driven definition of relevant phylogenetic clades accomplished by our ranking strategy identifies features in the samples that are lost if phylogenetic relationships are not considered, improving our ability to mine metagenomic datasets. Comparison with supervised classification methods currently used in metagenomic data analysis highlights the advantages of using phylogenetic information.

  1. Revealing large metagenomic regions through long DNA fragment hybridization capture.

    Science.gov (United States)

    Gasc, Cyrielle; Peyret, Pierre

    2017-03-14

    High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes from single organisms or metagenomic samples. However, due to the limited capacity of short-read sequence data to assemble complex or low coverage regions, genomes are typically fragmented, leading to draft genomes with numerous underexplored large genomic regions. Revealing these missing sequences is a major goal to resolve concerns in numerous biological studies. To overcome these limitations, we developed an innovative target enrichment method for the reconstruction of large unknown genomic regions. Based on a hybridization capture strategy, this approach enables the enrichment of large genomic regions allowing the reconstruction of tens of kilobase pairs flanking a short, targeted DNA sequence. Applied to a metagenomic soil sample targeting the linA gene, the biomarker of hexachlorocyclohexane (HCH) degradation, our method permitted the enrichment of the gene and its flanking regions leading to the reconstruction of several contigs and complete plasmids exceeding tens of kilobase pairs surrounding linA. Thus, through gene association and genome reconstruction, we identified microbial species involved in HCH degradation which constitute targets to improve biostimulation treatments. This new hybridization capture strategy makes surveying and deconvoluting complex genomic regions possible through the enrichment of large genomic regions and allows the efficient exploration of metagenomic diversity. Indeed, this approach makes it possible to assign identity and function to microorganisms in natural environments, one of the ultimate goals of microbial ecology.

  2. Culture-independent discovery of natural products from soil metagenomes.

    Science.gov (United States)

    Katz, Micah; Hover, Bradley M; Brady, Sean F

    2016-03-01

    Bacterial natural products have proven to be invaluable starting points in the development of many currently used therapeutic agents. Unfortunately, traditional culture-based methods for natural product discovery have been deemphasized by pharmaceutical companies due in large part to high rediscovery rates. Culture-independent, or "metagenomic," methods, which rely on the heterologous expression of DNA extracted directly from environmental samples (eDNA), have the potential to provide access to metabolites encoded by a large fraction of the earth's microbial biosynthetic diversity. As soil is both ubiquitous and rich in bacterial diversity, it is an appealing starting point for culture-independent natural product discovery efforts. This review provides an overview of the history of soil metagenome-driven natural product discovery studies and elaborates on the recent development of new tools for sequence-based, high-throughput profiling of environmental samples used in discovering novel natural product biosynthetic gene clusters. We conclude with several examples of these new tools being employed to facilitate the recovery of novel secondary metabolite encoding gene clusters from soil metagenomes and the subsequent heterologous expression of these clusters to produce bioactive small molecules.

  3. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yu-Wei [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Simmons, Blake A. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Singer, Steven W. [Joint BioEnergy Inst. (JBEI), Emeryville, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-10-29

    The recovery of genomes from metagenomic datasets is a critical step to defining the functional roles of the underlying uncultivated populations. We previously developed MaxBin, an automated binning approach for high-throughput recovery of microbial genomes from metagenomes. Here, we present an expanded binning algorithm, MaxBin 2.0, which recovers genomes from co-assembly of a collection of metagenomic datasets. Tests on simulated datasets revealed that MaxBin 2.0 is highly accurate in recovering individual genomes, and the application of MaxBin 2.0 to several metagenomes from environmental samples demonstrated that it could achieve two complementary goals: recovering more bacterial genomes compared to binning a single sample as well as comparing the microbial community composition between different sampling environments. Availability and implementation: MaxBin 2.0 is freely available at http://sourceforge.net/projects/maxbin/ under BSD license. Supplementary information: Supplementary data are available at Bioinformatics online.

  4. Metagenomic diagnostics for the simultaneous detection of multiple pathogens in human stool specimens from Côte d'Ivoire: a proof-of-concept study.

    Science.gov (United States)

    Schneeberger, Pierre H H; Becker, Sören L; Pothier, Joël F; Duffy, Brion; N'Goran, Eliézer K; Beuret, Christian; Frey, Jürg E; Utzinger, Jürg

    2016-06-01

    The intestinal microbiome is a complex community and its role in influencing human health is poorly understood. While conventional microbiology commonly attributes digestive disorders to a single microorganism, a metagenomic approach can detect multiple pathogens simultaneously and might elucidate the role of microbial communities in the pathogenesis of intestinal diseases. We present a proof-of-concept that a shotgun metagenomic approach provides useful information on the diverse composition of intestinal pathogens and antimicrobial resistance profiles in human stool samples. In October 2012, we obtained stool specimens from patients with persistent diarrhea in south Côte d'Ivoire. Four stool samples were purposefully selected and subjected to microscopy, multiplex polymerase chain reaction (PCR), and a metagenomic approach. For the latter, we employed the National Center for Biotechnology Information nucleotide database and screened for 36 pathogenic organisms (bacteria, helminths, intestinal protozoa, and viruses) that may cause digestive disorders. We further characterized the bacterial population and the prevailing resistance patterns by comparing our metagenomic datasets with a genome-specific marker database and with a comprehensive antibiotic resistance database. In the four patients, the metagenomic approach identified between eight and 11 pathogen classes that potentially cause digestive disorders. For bacterial pathogens, the diagnostic agreement between multiplex PCR and metagenomics was high; yet, metagenomics diagnosed several bacteria not detected by multiplex PCR. In contrast, some of the helminth and intestinal protozoa infections detected by microscopy were missed by metagenomics. The antimicrobial resistance analysis revealed the presence of genes conferring resistance to several commonly used antibiotics. A metagenomic approach provides detailed information on the presence and diversity of pathogenic organisms in human stool samples

  5. Simultaneous virus identification and characterization of severe unexplained pneumonia cases using a metagenomics sequencing technique.

    Science.gov (United States)

    Zou, Xiaohui; Tang, Guangpeng; Zhao, Xiang; Huang, Yan; Chen, Tao; Lei, Mingyu; Chen, Wenbing; Yang, Lei; Zhu, Wenfei; Zhuang, Li; Yang, Jing; Feng, Zhaomin; Wang, Dayan; Wang, Dingming; Shu, Yuelong

    2017-03-01

    Many viruses can cause respiratory diseases in humans. Although great advances have been achieved in methods of diagnosis, it remains challenging to identify pathogens in unexplained pneumonia (UP) cases. In this study, we applied next-generation sequencing (NGS) technology and a metagenomic approach to detect and characterize respiratory viruses in UP cases from Guizhou Province, China. A total of 33 oropharyngeal swabs were obtained from hospitalized UP patients and subjected to NGS. An unbiased metagenomic analysis pipeline identified 13 virus species in 16 samples. Human rhinovirus C was the virus most frequently detected and was identified in seven samples. Human measles virus, adenovirus B 55 and coxsackievirus A10 were also identified. Metagenomic sequencing also provided virus genomic sequences, which enabled genotype characterization and phylogenetic analysis. For cases of multiple infection, metagenomic sequencing afforded information regarding the quantity of each virus in the sample, which could be used to evaluate each virus's role in the disease. Our study highlights the potential of metagenomic sequencing for pathogen identification in UP cases.

  6. 45 CFR 303.101 - Expedited processes.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 2 2010-10-01 2010-10-01 false Expedited processes. 303.101 Section 303.101... STANDARDS FOR PROGRAM OPERATIONS § 303.101 Expedited processes. (a) Definition. Expedited processes means... intrastate cases, expedited processes as specified under this section to establish paternity and to establish...

  7. 12 CFR 347.118 - Expedited processing.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Expedited processing. 347.118 Section 347.118... INTERNATIONAL BANKING § 347.118 Expedited processing. (a) Expedited processing of branch applications. An... foreign country, after complying with the expedited processing requirements contained in § 303.182(b) and...

  8. A lunar polar expedition

    Science.gov (United States)

    Dowling, Richard; Staehle, Robert L.; Svitek, Tomas

    1992-01-01

    Advanced exploration and development in harsh environments require mastery of basic human survival skills. Expeditions into the lethal climates of Earth's polar regions offer useful lessons for tomorrow's lunar pioneers. In Arctic and Antarctic exploration, 'wintering over' was a crucial milestone. The ability to establish a supply base and survive months of polar cold and darkness made extensive travel and exploration possible. Because of the possibility of near-constant solar illumination, the lunar polar regions, unlike Earth's, may offer the most hospitable site for habitation. The World Space Foundation is examining a scenario for establishing a five-person expeditionary team on the lunar north pole for one year. This paper is a status report on a point design addressing site selection, transportation, power, and life support requirements.

  9. Strain-Level Metagenomic Analysis of the Fermented Dairy Beverage Nunu Highlights Potential Food Safety Risks.

    Science.gov (United States)

    Walsh, Aaron M; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C; Arthur, Cornelius T; Claesson, Marcus J; Scott, Karen P; Cotter, Paul D

    2017-08-15

    The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products. Copyright

  10. The future of skin metagenomics.

    Science.gov (United States)

    Mathieu, Alban; Vogel, Timothy M; Simonet, Pascal

    2014-01-01

    Metagenomics, the direct exploitation of environmental microbial DNA, is complementary to traditional culture-based approaches for deciphering taxonomic and functional microbial diversity in a plethora of ecosystems, including those related to the human body such as the mouth, saliva, teeth, gut or skin. DNA extracted from human skin analyzed by sequencing the PCR-amplified rrs gene has already revealed the taxonomic diversity of microbial communities colonizing the human skin ("skin microbiome"). Each individual possesses his/her own skin microbial community structure, with marked taxonomic differences between different parts of the body and temporal evolution depending on physical and chemical conditions (sweat, washing etc.). However, technical limitations due to the low bacterial density at the surface of the human skin or contamination by human DNA have still inhibited extended use of the metagenomic approach for investigating the skin microbiome at a functional level. These difficulties have been overcome in part by the new generation of sequencing platforms that now provide sequences describing the genes and functions carried out by skin bacteria. These methodological advances should help us understand the mechanisms by which these microorganisms adapt to the specific chemical composition of each skin and thereby lead to a better understanding of bacteria/human host interdependence. This knowledge will pave the way for more systemic and individualized pharmaceutical and cosmetic applications. Copyright © 2013 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  11. Metagenome reveals potential microbial degradation of hydrocarbon coupled with sulfate reduction in an oil-immersed chimney from Guaymas Basin

    Directory of Open Access Journals (Sweden)

    Ying He

    2013-06-01

    Full Text Available Deep-sea hydrothermal vent chimneys contain a high diversity of microorganisms, yet the metabolic activity and the ecological functions of the microbial communities remain largely unexplored. In this study, a metagenomic approach was applied to characterize the metabolic potential in a Guaymas hydrothermal vent chimney and to conduct comparative genomic analysis among a variety of environments with sequenced metagenomes. Complete clustering of functional gene categories with a comparative metagenomic approach showed that this Guaymas chimney metagenome was clustered most closely with a chimney metagenome from Juan de Fuca. All chimney samples were enriched with genes involved in recombination and repair, chemotaxis and flagellar assembly, highlighting their roles in coping with the fluctuating extreme deep-sea environments. A high proportion of transposases was observed in all the metagenomes from deep-sea chimneys, supporting the previous hypothesis that horizontal gene transfer may be common in the deep-sea vent chimney biosphere. In the Guaymas chimney metagenome, thermophilic sulfate-reducing microorganisms including bacteria and archaea were found to be predominant, and genes coding for the degradation of refractory organic compounds such as cellulose, lipid, pullulan, as well as a few hydrocarbons including toluene, ethylbenzene and o-xylene were identified. Therefore, this oil-immersed chimney supported a thermophilic microbial community capable of oxidizing a range of hydrocarbons that served as electron donors for sulphate reduction under anaerobic conditions.

  12. MOCAT: a metagenomics assembly and gene prediction toolkit.

    Directory of Open Access Journals (Sweden)

    Jens Roat Kultima

    Full Text Available MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.

  13. Challenges of the Unknown: Clinical Application of Microbial Metagenomics

    Directory of Open Access Journals (Sweden)

    Graham Rose

    2015-01-01

    Full Text Available Availability of fast, high throughput and low cost whole genome sequencing holds great promise within public health microbiology, with applications ranging from outbreak detection and tracking transmission events to understanding the role played by microbial communities in health and disease. Within clinical metagenomics, identifying microorganisms from a complex and host enriched background remains a central computational challenge. As proof of principle, we sequenced two metagenomic samples, a known viral mixture of 25 human pathogens and an unknown complex biological model using benchtop technology. The datasets were then analysed using a bioinformatic pipeline developed around recent fast classification methods. A targeted approach was able to detect 20 of the viruses against a background of host contamination from multiple sources and bacterial contamination. An alternative untargeted identification method was highly correlated with these classifications, and over 1,600 species were identified when applied to the complex biological model, including several species captured at over 50% genome coverage. In summary, this study demonstrates the great potential of applying metagenomics within the clinical laboratory setting and that this can be achieved using infrastructure available to nondedicated sequencing centres.

  14. A retrospective metagenomics approach to studying Blastocystis

    DEFF Research Database (Denmark)

    Andersen, Lee O'Brien; Bonde, Ida; Nielsen, Henrik Bjørn

    2015-01-01

    Blastocystis is a common single-celled intestinal parasitic genus, comprising several subtypes. Here, we screened data obtained by metagenomic analysis of faecal DNA for Blastocystis by searching for subtype-specific genes in coabundance gene groups, which are groups of genes that covary across......- and Prevotella-driven enterotypes. This is the first study to investigate the relationship between Blastocystis and communities of gut bacteria using a metagenomics approach. The study serves as an example of how it is possible to retrospectively investigate microbial eukaryotic communities in the gut using...... metagenomic datasets targeting the bacterial component of the intestinal microbiome and the interplay between these microbial communities....

  15. Metagenomics of the deep Mediterranean, a warm bathypelagic habitat.

    Directory of Open Access Journals (Sweden)

    Ana-Belen Martín-Cuadrado

    Full Text Available BACKGROUND: Metagenomics is emerging as a powerful method to study the function and physiology of the unexplored microbial biosphere, and is causing us to re-evaluate basic precepts of microbial ecology and evolution. Most marine metagenomic analyses have been nearly exclusively devoted to photic waters. METHODOLOGY/PRINCIPAL FINDINGS: We constructed a metagenomic fosmid library from 3,000 m-deep Mediterranean plankton, which is much warmer (approximately 14 degrees C) than waters of similar depth in open oceans (approximately 2 degrees C). We analyzed the library both by phylogenetic screening based on 16S rRNA gene amplification from clone pools and by sequencing both insert extremities of ca. 5,000 fosmids. Genome recruitment strategies showed that the majority of high scoring pairs corresponded to genomes from Rhizobiales within the Alphaproteobacteria, Cenarchaeum symbiosum, Planctomycetes, Acidobacteria, Chloroflexi and Gammaproteobacteria. We have found a community structure similar to that found in the aphotic zone of the Pacific. However, the similarities were significantly higher to the mesopelagic (500-700 m deep) in the Pacific than to the single 4000 m deep sample studied at this location. Metabolic genes were mostly related to catabolism, transport and degradation of complex organic molecules, in agreement with a prevalent heterotrophic lifestyle for deep-sea microbes. However, we observed a high percentage of genes encoding dehydrogenases and, among them, cox genes, suggesting that aerobic carbon monoxide oxidation may be important in the deep ocean as an additional energy source. CONCLUSIONS/SIGNIFICANCE: The comparison of metagenomic libraries from the deep Mediterranean and the Pacific ALOHA water column showed that bathypelagic Mediterranean communities more closely resemble mesopelagic communities in the Pacific, and suggests that, in the absence of light, temperature is a major stratifying factor in the oceanic water column, overriding

  16. EBI metagenomics--a new resource for the analysis and archiving of metagenomic data.

    Science.gov (United States)

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

  17. ISS Expedition 33 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 33 from 07/2012-11/2012. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  18. ISS Expedition 37 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 37 from 05/2013-11/2013. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  19. ISS Expedition 01 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 01 from 10/2000-03/2001. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  20. ISS Expedition 23 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 23 from 12/2009-09/2010. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  1. ISS Expedition 24 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 24 from 04/2010-11/2010. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  2. ISS Expedition 09 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 09 from 04/2004-10/2004. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  3. ISS Expedition 11 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 11 from 04/2005-10/2005. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  4. ISS Expedition 06 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 06 from 11/2002-05/2003. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  5. ISS Expedition 16 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 16 from 10/2007-04/2008. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  6. ISS Expedition 28 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 28 from 04/2011-11/2011. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  7. ISS Expedition 03 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 03 from 08/2001-12/2001. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  8. ISS Expedition 10 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 10 from 10/2004-04/2005. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  9. ISS Expedition 07 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 07 from 04/2003-10/2003. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  10. ISS Expedition 39 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 39 from 11/2013-05/2014. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  11. ISS Expedition 08 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 08 from 10/2003-04/2004. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  12. ISS Expedition 15 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 15 from 04/2007-10/2007. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  13. ISS Expedition 12 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 12 from 10/2005-04/2006. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  14. ISS Expedition 05 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 05 from 06/2002-12/2002. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  15. ISS Expedition 04 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 04 from 12/2001-06/2002. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  16. ISS Expedition 42 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 42 from 09/2014-03/2015. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  17. ISS Expedition 38 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 38 from 09/2013-03/2014. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  18. ISS Expedition 43 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 43 from 11/2014-06/2015. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  19. ISS Expedition 19 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 19 from 03/2009-05/2009. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  20. ISS Expedition 14 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 14 from 09/2006-04/2007. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  1. ISS Expedition 36 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 36 from 03/2013-09/2013. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  2. ISS Expedition 34 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 34 from 12/2012-03/2013. Press kits contain information about each mission overview, crew, mission timeline, benefits, and media...

  3. ViromeScan: a new tool for metagenomic viral community profiling.

    Science.gov (United States)

    Rampelli, Simone; Soverini, Matteo; Turroni, Silvia; Quercia, Sara; Biagi, Elena; Brigidi, Patrizia; Candela, Marco

    2016-03-01

    Bioinformatics tools available for metagenomic sequencing analysis are principally devoted to the identification of microorganisms populating an ecological niche, but they usually do not consider viruses. Only a few software tools have been designed to profile viral sequences, and they are not efficient at characterizing viruses in the context of complex communities, such as the intestinal microbiota, that contain bacteria, archaebacteria, eukaryotic microorganisms and viruses. In any case, a comprehensive description of host-microbiota interactions cannot ignore the profile of eukaryotic viruses within the virome, as viruses are critical for the regulation of the host immunophenotype. ViromeScan is an innovative metagenomic analysis tool that characterizes the taxonomy of the virome directly from raw data of next-generation sequencing. The tool uses hierarchical databases for eukaryotic viruses to unambiguously assign reads to viral species more accurately and >1000 fold faster than other existing approaches. We validated ViromeScan on synthetic microbial communities and applied it on metagenomic samples of the Human Microbiome Project, providing a sensitive eukaryotic virome profiling of different human body sites. ViromeScan allows the user to explore and taxonomically characterize the virome from metagenomic reads, efficiently denoising samples from reads of other microorganisms. This implies that users can fully characterize the microbiome, including bacteria and viruses, by shotgun metagenomic sequencing followed by different bioinformatic pipelines.

  4. Nitrogen Cycling In The Deep Subsurface Underlying Oligotrophic Open Ocean Regions: Metagenomic Analyses of North Pond Sediment Microbial Communities

    Science.gov (United States)

    Ziebis, W.

    2016-12-01

    Recent research on the deep ocean floor underlying open ocean regions in the Atlantic and Pacific has revealed that oxygen penetrates several tens of meters into the sediment column from the overlying water. And, in contrast to the better-studied continental margin setting, nitrate also persists within these organic-poor sediments throughout the sediment column. Moreover, in places where seawater flows through the basaltic crust, it has been shown that oxygen diffuses upward into the overlying sediment, creating an oxic sediment layer above the basalt. The flanks of the Mid-Atlantic Ridge are characterized by sediment-filled depressions that are surrounded by a steep topography of basaltic outcrops, which are the conduits for low-temperature hydrothermally driven seawater circulation through the basaltic basement. IODP Expedition 336 targeted North Pond, one such sediment pond. Three sites were drilled, which varied in sediment thickness from about 90 m (U1382B, U1384A) to 40 m (U1383D/E). Oxygen penetrated deeply into the sediment column from the overlying water (30 m) and diffused upward from the basaltic basement to several meters (10 - 20 m) above the basalt. Aerobic respiration created an anoxic zone in the middle of the sediment column. Concurrently, nitrate accumulated above bottom seawater concentrations to up to 50 µM. Previous investigations, using a stable isotope approach, showed an active subsurface nitrogen cycle. We obtained samples from all 3 drilling sites and selected 10 samples from the oxic upper layer, the upper suboxic transition zone, the anoxic middle, the lower suboxic zone and the deep oxic layer above the basalt for detailed analyses. We extracted intact cells from large volumes of sediment (1 L) using a density centrifugation approach for amplicon (16S rRNA) and metagenome sequencing, with the goal of characterizing the phylogenetic and functional gene inventory in these different sediment layers. We will provide first results on the deep

  5. MG-RAST, a Metagenomics Service for Analysis of Microbial Community Structure and Function.

    Science.gov (United States)

    Keegan, Kevin P; Glass, Elizabeth M; Meyer, Folker

    2016-01-01

    Approaches in molecular biology, particularly those that deal with high-throughput sequencing of entire microbial communities (the field of metagenomics), are rapidly advancing our understanding of the composition and functional content of microbial communities involved in climate change, environmental pollution, human health, biotechnology, etc. Metagenomics provides researchers with the most complete picture of the taxonomic (i.e., what organisms are there) and functional (i.e., what are those organisms doing) composition of natively sampled microbial communities, making it possible to perform investigations that include organisms that were previously intractable to laboratory-controlled culturing; currently, these constitute the vast majority of all microbes on the planet. All organisms contained in environmental samples are sequenced in a culture-independent manner, most often with 16S ribosomal amplicon methods to investigate the taxonomic content, or with whole-genome shotgun-based methods to investigate the functional content, of sampled communities. Metagenomics allows researchers to characterize the community composition and functional content of microbial communities, but it cannot show which functional processes are active; however, near parallel developments in transcriptomics promise a dramatic increase in our knowledge in this area as well. Since 2008, MG-RAST (Meyer et al., BMC Bioinformatics 9:386, 2008) has served as a public resource for annotation and analysis of metagenomic sequence data, providing a repository that currently houses more than 150,000 data sets (containing 60+ tera-base-pairs) with more than 23,000 publicly available. MG-RAST, or the metagenomics RAST (rapid annotation using subsystems technology) server, makes it possible for users to upload raw metagenomic sequence data in (preferably) fastq or fasta format. Assessments of sequence quality and annotation with respect to multiple reference databases are performed automatically with minimal

  6. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    DEFF Research Database (Denmark)

    Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund

    2017-01-01

    An increasing number of species and gene identification studies rely on the use of next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations and a previous metagenomics benchmark study has shown that a vast......-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprised of 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained...... 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows MGmapper assigns taxonomy at species level using 84.8% of the sequence reads, compared to 70...

  7. Acquiring Reference Genomes from Uncultured Microbes by Micromanipulation and Low-complexity Metagenomics

    DEFF Research Database (Denmark)

    Karst, Søren Michael; Albertsen, Mads; Nielsen, Jeppe Lund

    in order to obtain low complexity metagenomes from which high quality draft genomes could be assembled. Microorganisms were visualized with FISH in samples from Danish wastewater treatment plants and single filaments and microcolonies were isolated using an epifluorescence microscope equipped...... with a Skerman micromanipulator mounted with µm sized glass hooks. The isolated cells were lysed and genomic DNA was amplified using multiple displacement amplification. The amplified DNA was validated by 16/18S PCR and Sanger sequencing before being sequenced on an Illumina platform. Metagenome assembly...... analysis several samples showed promising assembly statistics, low metagenome complexity and good binning capabilities. Species obtained belong to a number of phylogenetic groups which are highly relevant in wastewater treatment: Chloroflexi, Thiothrix and Microthrix....

  8. Comparative analysis of metagenomes of Italian top soil improvers.

    Science.gov (United States)

    Gigliucci, Federica; Brambilla, Gianfranco; Tozzoli, Rosangela; Michelacci, Valeria; Morabito, Stefano

    2017-05-01

    Biosolids originating from Municipal Waste Water Treatment Plants are proposed as top soil improvers (TSI) for their beneficial input of organic carbon on agriculture lands. Their use to amend soil is controversial, as it may lead to the presence of emerging hazards of anthropogenic or animal origin in the environment devoted to food production. In this study, we used shotgun metagenomic sequencing as a tool to characterize the hazards related to the TSIs. The samples showed the presence of many virulence genes associated with different diarrheagenic E. coli pathotypes as well as of different antimicrobial resistance-associated genes. Genes conferring resistance to fluoroquinolones were the most relevant class of antimicrobial resistance genes observed in all the samples tested. To a lesser extent, traits associated with resistance to methicillin in staphylococci and genes conferring resistance to streptothricin, fosfomycin and vancomycin were also identified. The most represented metal resistance genes were cobalt-zinc-cadmium related, accounting for 15-50% of the sequence reads in the different metagenomes out of the total number of those mapping on the class of resistance to compounds determinants. Moreover, the taxonomic analysis performed by comparing compost-based samples and biosolids derived from municipal sewage-sludge treatments divided the samples into separate populations, based on the microbiota composition. The results confirm that metagenomics is efficient at detecting genomic traits associated with pathogens and antimicrobial resistance in complex matrices, and that this approach can be efficiently used for the traceability of TSI samples using the microorganisms' profiles as indicators of their origin. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. MIPE: A metagenome-based community structure explorer and SSU primer evaluation tool.

    Directory of Open Access Journals (Sweden)

    Bin Zou

    An understanding of microbial community structure is an important issue in the field of molecular ecology. The traditional molecular method involves amplification of small subunit ribosomal RNA (SSU rRNA) genes by polymerase chain reaction (PCR). However, PCR-based amplicon approaches are affected by primer bias and chimeras. With the development of high-throughput sequencing technology, unbiased SSU rRNA gene sequences can be mined from shotgun sequencing-based metagenomic or metatranscriptomic datasets to obtain a reflection of the microbial community structure in specific types of environment and to evaluate SSU primers. However, the use of short reads obtained through next-generation sequencing for primer evaluation has not been well resolved. The software MIPE (MIcrobiota metagenome Primer Explorer) was developed to adapt numerous short reads from metagenomes and metatranscriptomes. Using metagenomic or metatranscriptomic datasets as input, MIPE extracts and aligns rRNA to reveal detailed information on microbial composition and evaluate SSU rRNA primers. A mock dataset, a real Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) test dataset, two PrimerProspector test datasets and a real metatranscriptomic dataset were used to validate MIPE. The software calls Mothur (v1.33.3) and the SILVA database (v119) for the alignment and classification of rRNA genes from a metagenome or metatranscriptome. MIPE can effectively extract shotgun rRNA reads from a metagenome or metatranscriptome and is capable of classifying these sequences and exhibiting sensitivity to different SSU rRNA PCR primers. Therefore, MIPE can be used to guide primer design for specific environmental samples.

  10. Metagenomic Analysis of Kimchi, a Traditional Korean Fermented Food

    Science.gov (United States)

    Jung, Ji Young; Lee, Se Hee; Kim, Jeong Myeong; Park, Moon Su; Bae, Jin-Woo; Hahn, Yoonsoo; Madsen, Eugene L.; Jeon, Che Ok

    2011-01-01

    Kimchi, a traditional food in the Korean culture, is made from vegetables by fermentation. In this study, metagenomic approaches were used to monitor changes in bacterial populations, metabolic potential, and overall genetic features of the microbial community during the 29-day fermentation process. Metagenomic DNA was extracted from kimchi samples obtained periodically and was sequenced using a 454 GS FLX Titanium system, which yielded a total of 701,556 reads, with an average read length of 438 bp. Phylogenetic analysis based on 16S rRNA genes from the metagenome indicated that the kimchi microbiome was dominated by members of three genera: Leuconostoc, Lactobacillus, and Weissella. Assignment of metagenomic sequences to SEED categories of the Metagenome Rapid Annotation using Subsystem Technology (MG-RAST) server revealed a genetic profile characteristic of heterotrophic lactic acid fermentation of carbohydrates, which was supported by the detection of mannitol, lactate, acetate, and ethanol as fermentation products. When the metagenomic reads were mapped onto the database of completed genomes, the Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293 and Lactobacillus sakei subsp. sakei 23K genomes were highly represented. These same two genera were confirmed to be important in kimchi fermentation when the majority of kimchi metagenomic sequences showed very high identity to Leuconostoc mesenteroides and Lactobacillus genes. Besides microbial genome sequences, a surprisingly large number of phage DNA sequences were identified from the cellular fractions, possibly indicating that a high proportion of cells were infected by bacteriophages during fermentation. Overall, these results provide insights into the kimchi microbial community and also shed light on fermentation processes carried out broadly by complex microbial communities. PMID:21317261

  11. Tapping uncultured microorganisms through metagenomics for drug ...

    African Journals Online (AJOL)

    Tapping uncultured microorganisms through metagenomics for drug discovery. Abdelnasser Salah Shebl Ibrahim, Ali Abdullah Al-Salamah, Ashraf A Hatamleh, Mohammed S El-Shiekh, Shebl Salah S Ibrahim ...

  12. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.

    Directory of Open Access Journals (Sweden)

    Eric B Alsop

    Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism's inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomics samples as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
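
    The idea of an oligonucleotide signature in the abstract above can be illustrated with a minimal sketch. It counts every overlapping word of length k in a DNA fragment and normalizes the counts into a frequency vector (k = 4 gives a tetranucleotide signature, k = 7 a heptanucleotide one). The function names, the normalization, and the Euclidean distance are assumptions chosen for illustration; the abstract does not state how the authors compute or compare signatures.

```python
from collections import Counter
from itertools import product
import math


def kmer_signature(seq, k):
    """Return the normalized frequency of every ACGT word of length k.

    Words are ordered lexicographically (AA..., AC..., ...). Windows that
    contain any other character (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean(a, b):
    """Euclidean distance between two signatures (an illustrative choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

    A signature has 4^k entries, so a heptanucleotide vector (16,384 entries) is 64 times longer than a tetranucleotide one (256 entries), which is one way to see the trade-off between resolving power and computational time that the abstract mentions.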

  13. The Indigo V Indian Ocean Expedition: a prototype for citizen microbial oceanography

    DEFF Research Database (Denmark)

    Lauro, Frederico; Senstius, Svend Jacob; Cullen, Jay

    2014-01-01

    sample acquisition. The ultimate goal of the Indigo V Expedition is to create a working blueprint for 'citizen microbial oceanography'. We will present the preliminary outcomes of the first Indigo V expedition, from Cape Town to Singapore, highlighting the challenges and opportunities of such endeavours....

  14. Syllidae (Annelida: Polychaeta) from Indonesia collected by the Siboga (1899-1900) and Snellius II expeditions

    NARCIS (Netherlands)

    Aguado, M.T.; San Martín, G.; ten Hove, H.A.

    2008-01-01

    Twenty-seven samples of syllids (Annelida: Polychaeta) from Indonesia collected during the Siboga Expedition (1899-1900) and five during the Snellius II Expedition (1984) have been examined. Material from several other museums and institutions has also been included. Unpublished identifications of

  15. Sampling

    CERN Document Server

    Thompson, Steven K

    2012-01-01

    Praise for the Second Edition "This book has never had a competitor. It is the only book that takes a broad approach to sampling . . . any good personal statistics library should include a copy of this book." —Technometrics "Well-written . . . an excellent book on an important subject. Highly recommended." —Choice "An ideal reference for scientific researchers and other professionals who use sampling." —Zentralblatt Math Features new developments in the field combined with all aspects of obtaining, interpreting, and using sample data Sampling provides an up-to-date treat

  16. Metagenomics insights into food fermentations.

    Science.gov (United States)

    De Filippis, Francesca; Parente, Eugenio; Ercolini, Danilo

    2017-01-01

    This review describes the recent advances in the study of food microbial ecology, with a focus on food fermentations. High-throughput sequencing (HTS) technologies have been widely applied to the study of food microbial consortia and the different applications of HTS technologies were exploited in order to monitor microbial dynamics in food fermentative processes. Phylobiomics was the most explored application in the past decade. Metagenomics and metatranscriptomics, although still underexploited, promise to uncover the functionality of complex microbial consortia. The new knowledge acquired will help to understand how to make a profitable use of microbial genetic resources and modulate key activities of beneficial microbes in order to ensure process efficiency, product quality and safety. © 2016 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  17. Challenges and Opportunities of Airborne Metagenomics

    KAUST Repository

    Behzad, H.

    2015-05-06

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles.

  18. The Amordad database engine for metagenomics.

    Science.gov (United States)

    Behnam, Ehsan; Smith, Andrew D

    2014-10-15

    Several technical challenges in metagenomic data analysis, including assembling metagenomic sequence data or identifying operational taxonomic units, are both significant and well known. These forms of analysis are increasingly cited as conceptually flawed, given the extreme variation within traditionally defined species and rampant horizontal gene transfer. Furthermore, computational requirements of such analysis have hindered content-based organization of metagenomic data at large scale. In this article, we introduce the Amordad database engine for alignment-free, content-based indexing of metagenomic datasets. Amordad places the metagenome comparison problem in a geometric context, and uses an indexing strategy that combines random hashing with a regular nearest neighbor graph. This framework allows refinement of the database over time by continual application of random hash functions, with the effect of each hash function encoded in the nearest neighbor graph. This eliminates the need to explicitly maintain the hash functions in order for query efficiency to benefit from the accumulated randomness. Results on real and simulated data show that Amordad can support logarithmic query time for identifying similar metagenomes even as the database size reaches into the millions. Source code, licensed under the GNU General Public License (version 3), is freely available for download from http://smithlabresearch.org/amordad. Contact: andrewds@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
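
    To make the notion of random hashing in a geometric context more concrete, here is a generic sketch of random-hyperplane hashing, a textbook locality-sensitive hashing scheme. It is not Amordad's implementation and omits the nearest neighbor graph entirely; the function names, the feature vectors, and the bucket lookup are all hypothetical. Each metagenome is assumed to be already summarized as a numeric vector (for example, word frequencies).

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_vector(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the side it falls on."""
    return tuple(
        1 if sum(v * p for v, p in zip(vec, plane)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Group named vectors into buckets keyed by their hash."""
    index = {}
    for name, vec in vectors.items():
        index.setdefault(hash_vector(vec, planes), []).append(name)
    return index


def query(vec, index, planes):
    """Return the names stored in the same bucket as vec (candidate neighbors)."""
    return index.get(hash_vector(vec, planes), [])
```

    Vectors that point in similar directions tend to land in the same bucket, so a query only needs to inspect one bucket instead of the whole collection. How Amordad combines such hashes with its graph to reach the logarithmic query time reported above is described in the paper, not here.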

  19. Variability in Metagenomic Count Data and Its Influence on the Identification of Differentially Abundant Genes.

    Science.gov (United States)

    Jonsson, Viktor; Österlund, Tobias; Nerman, Olle; Kristiansson, Erik

    2017-04-01

    Metagenomics is the study of microorganisms in environmental and clinical samples using high-throughput sequencing of random fragments of their DNA. Since metagenomics does not require any prior culturing of isolates, entire microbial communities can be studied directly in their natural state. In metagenomics, the abundance of genes is quantified by sorting and counting the DNA fragments. The resulting count data are high-dimensional and affected by high levels of technical and biological noise that make the statistical analysis challenging. In this article, we introduce a hierarchical overdispersed Poisson model to explore the variability in metagenomic data. By analyzing three comprehensive data sets, we show that the gene-specific variability varies substantially between genes and is dependent on biological function. We also assess the power of identifying differentially abundant genes and show that incorrect assumptions about the gene-specific variability can lead to unacceptably high rates of false positives. Finally, we evaluate shrinkage approaches to improve the variance estimation and show that the prior choice significantly affects the statistical power. The results presented in this study further elucidate the complex variance structure of metagenomic data and provide suggestions for accurate and reliable identification of differentially abundant genes.
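
    Overdispersion means that the counts for a gene vary more than a plain Poisson model, whose variance equals its mean, would allow. One common way to express this, assumed here for illustration and not taken from the abstract, is a gamma-Poisson model with variance mu + phi * mu^2, where phi is a gene-specific dispersion. The sketch below gives a simple method-of-moments estimate of phi for one gene; the function name is hypothetical, and the samples are assumed to have comparable sequencing depth.

```python
from statistics import mean, variance


def dispersion_mom(counts):
    """Method-of-moments dispersion estimate for one gene.

    Solves  sample_variance = m + phi * m**2  for phi and clips the result
    at zero, so a gene that looks Poisson (or under-dispersed) gets phi = 0.
    """
    m = mean(counts)
    s2 = variance(counts)  # unbiased sample variance (divides by n - 1)
    if m == 0:
        return 0.0
    return max(0.0, (s2 - m) / (m * m))
```

    With only a handful of samples per gene, such per-gene estimates are noisy, which is the motivation for the shrinkage approaches evaluated in the abstract.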

  20. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    Science.gov (United States)

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-02-26

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.

  1. Comparative metagenome analysis of an Alaskan glacier.

    Science.gov (United States)

    Choudhari, Sulbha; Lohia, Ruchi; Grigoriev, Andrey

    2014-04-01

    The temperature in the Arctic region has been increasing in the recent past accompanied by melting of its glaciers. We took a snapshot of the current microbial inhabitation of an Alaskan glacier (which can be considered as one of the simplest possible ecosystems) by using metagenomic sequencing of 16S rRNA recovered from ice/snow samples. Somewhat contrary to our expectations and earlier estimates, a rich and diverse microbial population of more than 2,500 species was revealed, including several species of Archaea that have been identified for the first time in the glaciers of the Northern hemisphere. The most prominent bacterial groups found were Proteobacteria, Bacteroidetes, and Firmicutes. Firmicutes were not reported in large numbers in a previously studied Alpine glacier but were dominant in an Antarctic subglacial lake. Representatives of Cyanobacteria, Actinobacteria and Planctomycetes were among the most numerous, likely reflecting the dependence of the ecosystem on the energy obtained through photosynthesis and close links with the microbial community of the soil. Principal component analysis (PCA) of nucleotide word frequency revealed distinct sequence clusters for different taxonomic groups in the Alaskan glacier community and separate clusters for the glacial communities from other regions of the world. Comparative analysis of the community composition and bacterial diversity present in the Byron glacier in Alaska with other environments showed larger overlap with an Arctic soil than with a high Arctic lake, indicating patterns of community exchange and suggesting that these bacteria may play an important role in soil development during glacial retreat.

  2. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm if the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences: a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence using our proposed pipeline. This could serve as a good starting point for experimental design and make comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  3. Technical Report: Benchmarking for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

    Energy Technology Data Exchange (ETDEWEB)

    McLoughlin, K. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]

    2016-01-22

    The software application “MetaQuant” was developed by our group at Lawrence Livermore National Laboratory (LLNL). It is designed to profile microbial populations in a sample using data from whole-genome shotgun (WGS) metagenomic DNA sequencing. Several other metagenomic profiling applications have been described in the literature. We ran a series of benchmark tests to compare the performance of MetaQuant against that of a few existing profiling tools, using real and simulated sequence datasets. This report describes our benchmarking procedure and results.

  4. Autotrophic microbe metagenomes and metabolic pathways differentiate adjacent red sea brine pools

    KAUST Repository

    Wang, Yong

    2013-04-29

    In the Red Sea, two neighboring deep-sea brine pools, Atlantis II and Discovery, have been studied extensively, and the results have shown that the temperature and concentrations of metal and methane in Atlantis II have increased over the past decades. Therefore, we investigated changes in the microbial community and metabolic pathways. Here, we compared the metagenomes of the two pools to each other and to those of deep-sea water samples. Archaea were generally absent in the Atlantis II metagenome; Bacteria in the metagenome were typically heterotrophic and depended on aromatic compounds and other extracellular organic carbon compounds as indicated by enrichment of the related metabolic pathways. In contrast, autotrophic Archaea capable of CO2 fixation and methane oxidation were identified in Discovery but not in Atlantis II. Our results suggest that hydrothermal conditions and metal precipitation in the Atlantis II pool have resulted in elimination of the autotrophic community and methanogens.

  5. Guidelines to Statistical Analysis of Microbial Composition Data Inferred from Metagenomic Sequencing.

    Science.gov (United States)

    Odintsova, Vera; Tyakht, Alexander; Alexeev, Dmitry

    2017-01-01

    Metagenomics, the application of high-throughput DNA sequencing for surveys of environmental samples, has revolutionized our view on the taxonomic and genetic composition of complex microbial communities. An enormous richness of microbiota keeps unfolding in the context of various fields ranging from biomedicine and food industry to geology. Primary analysis of metagenomic reads allows one to infer semi-quantitative data describing the community structure. However, such compositional data possess specific statistical properties that must be considered during preprocessing, hypothesis testing and interpretation of the results of statistical tests. Failure to account for these specifics may lead to essentially wrong conclusions from the survey. Here we introduce researchers new to the field of metagenomics to the basic properties of microbial compositional data, including statistical power and proposed distribution models, review the publicly available software tools developed specifically for such data, and outline recommendations for the application of the methods.
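
    One widely used way to respect the compositional nature of such data is the centered log-ratio (CLR) transform. The abstract does not name a specific transform, so the sketch below is only an illustration of the general idea; the function name `clr` and the default pseudocount of 0.5 (used to avoid taking the log of zero) are assumptions.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes the log of the count relative to the geometric
    mean of the sample, so the result does not depend on total read depth.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

    With the pseudocount set to zero, multiplying every count in a sample by the same factor leaves the transformed values unchanged, which is the property that makes samples of different sequencing depth comparable.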

  6. A metagenomic study of methanotrophic microorganisms in Coal Oil Point seep sediments

    Directory of Open Access Journals (Sweden)

    Haverkamp Thomas HA

    2011-10-01

    Background: Methane oxidizing prokaryotes in marine sediments are believed to function as a methane filter reducing the oceanic contribution to the global methane emission. In the anoxic parts of the sediments, oxidation of methane is accomplished by anaerobic methanotrophic archaea (ANME) living in syntrophy with sulphate reducing bacteria. This anaerobic oxidation of methane is assumed to be a coupling of reversed methanogenesis and dissimilatory sulphate reduction. Where oxygen is available, aerobic methanotrophs take part in methane oxidation. In this study, we used metagenomics to characterize the taxonomic and metabolic potential for methane oxidation at the Tonya seep in the Coal Oil Point area, California. Two metagenomes from different sediment depth horizons (0-4 cm and 10-15 cm below sea floor) were sequenced by 454 technology. The metagenomes were analysed to characterize the distribution of aerobic and anaerobic methanotrophic taxa at the two sediment depths. To gain insight into the metabolic potential, the metagenomes were searched for marker genes associated with methane oxidation. Results: BLAST searches followed by taxonomic binning in MEGAN revealed aerobic methanotrophs of the genus Methylococcus to be overrepresented in the 0-4 cm metagenome compared to the 10-15 cm metagenome. In the 10-15 cm metagenome, ANME of the ANME-1 clade were identified as the most abundant methanotrophic taxon with 8.6% of the reads. Searches for particulate methane monooxygenase (pmoA) and methyl-coenzyme M reductase (mcrA), marker genes for aerobic and anaerobic oxidation of methane respectively, identified pmoA in the 0-4 cm metagenome as Methylococcaceae related. The mcrA reads from the 10-15 cm horizon were all classified as originating from the ANME-1 clade. Conclusions: Most of the taxa detected were present in both metagenomes and differences in community structure and corresponding metabolic potential between the two samples were mainly

  7. Multisubstrate isotope labeling and metagenomic analysis of active soil bacterial communities.

    Science.gov (United States)

    Verastegui, Y; Cheng, J; Engel, K; Kolczynski, D; Mortimer, S; Lavigne, J; Montalibet, J; Romantsov, T; Hall, M; McConkey, B J; Rose, D R; Tomashek, J J; Scott, B R; Charles, T C; Neufeld, J D

    2014-07-15

    Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon ((12)C) or stable-isotope-labeled ((13)C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the (13)C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. Importance: The ability to identify genes based on function, instead of sequence homology, allows the discovery of genes that would not be identified through sequence alone. 
This

  8. The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes

    Directory of Open Access Journals (Sweden)

    Stevens R

    2008-09-01

    Background: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers. Results: A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. Conclusion: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org

  9. Metagenomics of Glassy-Winged Sharpshooter, Homalodisca vitripennis (Hemiptera: Cicadellidae)

    Science.gov (United States)

    A metagenomics approach was used to identify unknown organisms which live in association with the glassy-winged sharpshooter, Homalodisca vitripennis (Hemiptera: Cicadellidae). Metagenomics combines molecular biology and genetics to identify and characterize genetic material from unique biological ...

  10. Interactive metagenomic visualization in a Web browser.

    Science.gov (United States)

    Ondov, Brian D; Bergman, Nicholas H; Phillippy, Adam M

    2011-09-30

    A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.

  11. Metagenomic Guilt by Association: An Operonic Perspective

    Science.gov (United States)

    Vey, Gregory

    2013-01-01

    Next-generation sequencing projects continue to drive a vast accumulation of metagenomic sequence data. Given the growth rate of this data, automated approaches to functional annotation are indispensable and a cornerstone heuristic of many computational protocols is the concept of guilt by association. The guilt by association paradigm has been heavily exploited by genomic context methods that offer functional predictions that are complementary to homology-based annotations, thereby offering a means to extend functional annotation. In particular, operon methods that exploit co-directional intergenic distances can provide homology-free functional annotation through the transfer of functions among co-operonic genes, under the assumption that guilt by association is indeed applicable. Although guilt by association is a well-accepted annotative device, its applicability to metagenomic functional annotation has not been definitively demonstrated. Here a large-scale assessment of metagenomic guilt by association is undertaken where functional associations are predicted on the basis of co-directional intergenic distances. Specifically, functional annotations are compared within pairs of adjacent co-directional genes, as well as operons of various lengths (i.e. number of member genes), in order to reveal new information about annotative cohesion versus operon length. The results suggest that co-directional gene pairs offer reduced confidence for metagenomic guilt by association due to difficulty in resolving the existence of functional associations when intergenic distance is the sole predictor of pairwise gene interactions. However, metagenomic operons, particularly those with substantial lengths, appear to be capable of providing a superior basis for metagenomic guilt by association due to increased annotative stability. The need for improved recognition of metagenomic operons is discussed, as well as the limitations of the present work. PMID:23940763
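
    Predicting operons from co-directional intergenic distances can be sketched as a single pass over genes sorted by position. This is a minimal illustration rather than the procedure used in the article: the `Gene` record, the function `predict_operons`, and the 50 bp `max_gap` threshold are all hypothetical.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # first base of the gene on the contig
    end: int     # last base of the gene on the contig
    strand: str  # "+" or "-"


def predict_operons(genes: List[Gene], max_gap: int = 50) -> List[List[Gene]]:
    """Group adjacent genes on the same strand whose gap is at most max_gap."""
    operons: List[List[Gene]] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if operons:
            prev = operons[-1][-1]
            same_strand = gene.strand == prev.strand
            close_enough = gene.start - prev.end <= max_gap
            if same_strand and close_enough:
                operons[-1].append(gene)
                continue
        operons.append([gene])  # start a new candidate operon
    return operons
```

    Under guilt by association, an annotation known for one member of a predicted group would be proposed for its unannotated neighbours. The abstract's caution applies here: two-gene groups formed on distance alone are the least reliable, while longer groups give a more stable basis for transfer.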

  12. Interactive metagenomic visualization in a Web browser

    Directory of Open Access Journals (Sweden)

    Phillippy Adam M

    2011-09-01

    Background: A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Results: Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Conclusions: Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.

  13. Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing.

    Science.gov (United States)

    Fang, Chao; Zhong, Huanzi; Lin, Yuxiang; Chen, Bin; Han, Mo; Ren, Huahui; Lu, Haorong; Luber, Jacob Mayne; Xia, Min; Li, Wangsheng; Stein, Shayna; Xu, Xun; Zhang, Wenwei; Drmanac, Radoje; Wang, Jian; Yang, Huanming; Hammarström, Lennart; Kostic, Aleksandar David; Kristiansen, Karsten; Li, Junhua

    2017-12-23

    More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of two Illumina platforms. Using fecal samples from 20 healthy individuals we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform versus the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the two Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high quality reads (96.06% of raw reads) per sample with 90.56% of bases scoring Q30 and above was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias towards genes with higher GC content being enriched on the HiSeq platforms. Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.

  14. Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX

    DEFF Research Database (Denmark)

    Schubert, Mikkel; Ermini, Luca; Der Sarkissian, Clio

    2014-01-01

    -generation sequencing reads, PALEOMIX carries out adapter removal, mapping against reference genomes, PCR duplicate removal, characterization of and compensation for postmortem damage, SNP calling and maximum-likelihood phylogenomic inference, and it profiles the metagenomic contents of the samples. As such, PALEOMIX...

  15. myPhyloDB: a local web server for the storage and analysis of metagenomics data

    Science.gov (United States)

    myPhyloDB is a user-friendly personal database with a browser-interface designed to facilitate the storage, processing, analysis, and distribution of metagenomics data. MyPhyloDB archives raw sequencing files, and allows for easy selection of project(s)/sample(s) of any combination from all availab...

  16. What Can We Learn from a Metagenomic Analysis of a Georgian Bacteriophage Cocktail?

    DEFF Research Database (Denmark)

    Zschach, Henrike; Joensen, Katrine Grimstrup; Lindhard, Barbara

    2015-01-01

    study, the Intesti phage cocktail, a key commercial product of the Eliava Institute, Georgia, has been tested on a selection of bacterial strains, sequenced as a metagenomic sample, de novo assembled and analyzed by bioinformatics methods. Furthermore, eight bacterial host strains were infected...

  17. MetaCRAM: an integrated pipeline for metagenomic taxonomy identification and compression.

    Science.gov (United States)

    Kim, Minji; Zhang, Xiejia; Ligo, Jonathan G; Farnoud, Farzad; Veeravalli, Venugopal V; Milenkovic, Olgica

    2016-02-19

    Metagenomics is a genomics research discipline devoted to the study of microbial communities in environmental samples and human and animal organs and tissues. Sequenced metagenomic samples usually comprise reads from a large number of different bacterial communities and hence tend to result in large file sizes, typically ranging between 1-10 GB. This leads to challenges in analyzing, transferring and storing metagenomic data. In order to overcome these data processing issues, we introduce MetaCRAM, the first de novo, parallelized software suite specialized for FASTA and FASTQ format metagenomic read processing and lossless compression. MetaCRAM integrates algorithms for taxonomy identification and assembly, and introduces parallel execution methods; furthermore, it enables genome reference selection and CRAM based compression. MetaCRAM also uses novel reference-based compression methods designed through extensive studies of integer compression techniques and through fitting of empirical distributions of metagenomic read-reference positions. MetaCRAM is a lossless method compatible with standard CRAM formats, and it allows for fast selection of relevant files in the compressed domain via maintenance of taxonomy information. The performance of MetaCRAM as a stand-alone compression platform was evaluated on various metagenomic samples from the NCBI Sequence Read Archive, suggesting 2- to 4-fold compression ratio improvements compared to gzip. On average, the compressed file sizes were 2-13 percent of the original raw metagenomic file sizes. We described the first architecture for reference-based, lossless compression of metagenomic data. The compression scheme proposed offers significantly improved compression ratios as compared to off-the-shelf methods such as zip programs. Furthermore, it enables running different components in parallel and it provides the user with taxonomic and assembly information generated during execution of the compression pipeline. 
The Meta

  18. Diel Metagenomics and Metatranscriptomics of Elkhorn Slough Hypersaline Microbial Mat

    Science.gov (United States)

    Lee, J.; Detweiler, A. M.; Everroad, R. C.; Bebout, L. E.; Weber, P. K.; Pett-Ridge, J.; Bebout, B.

    2014-12-01

    To understand the variation in gene expression associated with the daytime oxygenic phototrophic and nighttime fermentation regimes seen in hypersaline microbial mats, a contiguous mat piece was subjected to sampling at regular intervals over a 24-hour diel period. Additionally, to understand the impact of sulfate reduction on biohydrogen consumption, molybdate was added to a parallel experiment in the same run. Four metagenome and 12 metatranscriptome Illumina HiSeq lanes were completed over day/night and control/molybdate experiments. Preliminary comparative examination of noon and midnight metatranscriptomic samples mapped using bowtie2 to reference genomes has revealed several notable results about the dominant mat-building cyanobacterium Microcoleus chthonoplastes PCC 7420. M. chthonoplastes PCC 7420 shows expression in several pathways for nitrogen scavenging, including nitrogen fixation. Reads mapped to M. chthonoplastes PCC 7420 show expression of two starch storage and utilization pathways, one a starch-trehalose-maltose-glucose pathway and the other a UDP-glucose-cellulose-β-1,4 glucan-glucose pathway. The overall trend of gene expression was primarily light-driven up-regulation followed by down-regulation in the dark, while much of the remaining expression profile appears to be constitutive. Co-assembly of quality-controlled reads from the 4 metagenomes was performed using Ray Meta with progressively smaller k-mer sizes, with bins identified and filtered using principal component analysis of coverages from all libraries and a %GC filter, followed by reassembly of the remaining co-assembly reads and binned reads. Despite relatively similar abundance profiles in each metagenome, this binning approach was able to distinctly resolve bins not only from dominant taxa but also from sulfate-reducing bacteria, which are of interest for understanding molybdate inhibition. Bins generated from this iterative assembly process will be used for downstream

  19. Combining gene prediction methods to improve metagenomic gene annotation

    Directory of Open Access Journals (Sweden)

    Rosen Gail L

    2011-01-01

    Background: Traditional gene annotation methods rely on characteristics that may not be available in short reads generated from next generation technology, resulting in suboptimal performance for metagenomic (environmental) samples. Therefore, in recent years, new programs have been developed that optimize performance on short reads. In this work, we benchmark three metagenomic gene prediction programs and combine their predictions to improve metagenomic read gene annotation. Results: We not only analyze the programs' performance at different read-lengths like similar studies, but also separate different types of reads, including intra- and intergenic regions, for analysis. The main deficiencies are in the algorithms' ability to predict non-coding regions and gene edges, resulting in more false-positives and false-negatives than desired. In fact, the specificities of the algorithms are notably worse than the sensitivities. By combining the programs' predictions, we show significant improvement in specificity at minimal cost to sensitivity, resulting in 4% improvement in accuracy for 100 bp reads with ~1% improvement in accuracy for 200 bp reads and above. To correctly annotate the start and stop of the genes, we find that a consensus of all the predictors performs best for shorter read lengths while a unanimous agreement is better for longer read lengths, boosting annotation accuracy by 1-8%. We also demonstrate use of the classifier combinations on a real dataset. Conclusions: To optimize the performance for both prediction and annotation accuracies, we conclude that the consensus of all methods (or a majority vote) is the best for reads 400 bp and shorter, while using the intersection of GeneMark and Orphelia predictions is the best for reads 500 bp and longer. We demonstrate that most methods predict over 80% coding (including partially coding) reads on a real human gut sample sequenced by Illumina technology.
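
    The combination rule described in the conclusions can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' code: the function name `combine_calls` and the placeholder key for the third predictor are hypothetical, and because the abstract does not say what to do for reads between 400 and 500 bp, the sketch arbitrarily applies the majority vote to everything shorter than 500 bp.

```python
def combine_calls(calls, read_length, long_read_cutoff=500):
    """Combine per-read coding/non-coding calls from several gene predictors.

    calls maps a predictor name to True (coding) or False (non-coding).
    Reads at or above long_read_cutoff use the intersection of the
    GeneMark and Orphelia calls; shorter reads use a majority vote.
    """
    if read_length >= long_read_cutoff:
        return calls["GeneMark"] and calls["Orphelia"]
    votes = sum(1 for is_coding in calls.values() if is_coding)
    return 2 * votes > len(calls)
```

    For example, a read called coding by two of three programs is accepted as coding when it is short, but rejected when it is long and one of the two named programs disagrees. The stricter rule for long reads trades a little sensitivity for the specificity gain the abstract reports.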

  20. Novel thermostable amine transferases from hot spring metagenomes.

    Science.gov (United States)

    Ferrandi, Erica Elisa; Previdi, Alessandra; Bassanini, Ivan; Riva, Sergio; Peng, Xu; Monti, Daniela

    2017-06-01

    Hot spring metagenomes, prepared from samples collected at temperatures ranging from 55 to 95 °C, were submitted to an in silico screening aimed at the identification of novel amine transaminases (ATAs), valuable biocatalysts for the preparation of optically pure amines. Three novel (S)-selective ATAs, namely Is3-TA, It6-TA, and B3-TA, were discovered in the metagenome of samples collected from hot springs in Iceland and in Italy, cloned from the corresponding metagenomic DNAs and overexpressed in recombinant form in E. coli. Functional characterization of the novel ATAs demonstrated that they all possess a thermophilic character and are capable of performing amine transfer reactions using a broad range of donor and acceptor substrates, thus suggesting a good potential for practical synthetic applications. In particular, the enzyme B3-TA proved to be exceptionally thermostable, retaining 85% of activity after 5 days of incubation at 80 °C and more than 40% after 2 weeks under the same conditions. These results, which were in agreement with the estimation of an apparent melting temperature around 88 °C, make B3-TA, to the best of our knowledge, the most thermostable natural ATA described to date. This biocatalyst also showed a good tolerance toward different water-miscible and water-immiscible organic solvents. A detailed inspection of the homology-based structural model of B3-TA showed that the overall active site architecture of mesophilic (S)-selective ATAs was mainly conserved in this hyperthermophilic homolog. Additionally, a subfamily of B3-TA-like transaminases, mostly uncharacterized and all from thermophilic microorganisms, was identified and analyzed in terms of phylogenetic relationships and sequence conservation.

  1. Large-scale machine learning for metagenomics sequence classification.

    Science.gov (United States)

    Vervier, Kévin; Mahé, Pierre; Tournoud, Maud; Veyrieras, Jean-Baptiste; Vert, Jean-Philippe

    2016-04-01

    Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genome to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and more sensitive to sequencing errors. We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2-17 times with respect to the BWA-MEM short read mapper, depending on the number of
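
    A compositional approach represents each read by the k-mers it contains. The sketch below shows one plausible way to build such a feature vector; the function name and the normalization are assumptions for illustration, not the authors' implementation. With k = 12 the vector has 4^12 = 16,777,216 entries, consistent with the roughly 10^7 dimensions mentioned in the abstract, so the example uses a small k.

```python
from collections import Counter
from itertools import product


def kmer_profile(read, k=4):
    """Return normalized k-mer frequencies of a DNA read as a fixed-order list.

    The list follows the lexicographic order of all 4**k k-mers over ACGT.
    k-mers containing other symbols (for example N) are ignored.
    """
    read = read.upper()
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


profile = kmer_profile("ACGTACGT", k=2)  # 16 features, summing to 1
```

    A vector like this could then be passed to any linear classifier; in practice a sparse representation would be needed for large k.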

  2. Preliminary High-Throughput Metagenome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Dusheyko, Serge; Furman, Craig; Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank

    2007-03-26

    Metagenome data sets present a qualitatively different assembly problem than traditional single-organism whole-genome shotgun (WGS) assembly. The unique aspects of such projects include the presence of a potentially large number of distinct organisms and their representation in the data set at widely different fractions. In addition, multiple closely related strains could be present, which would be difficult to assemble separately. Failure to take these issues into account can result in poor assemblies that either jumble together different strains or which fail to yield useful results. The DOE Joint Genome Institute has sequenced a number of metagenomic projects and plans to considerably increase this number in the coming year. As a result, the JGI has a need for high-throughput tools and techniques for handling metagenome projects. We present the techniques developed to handle metagenome assemblies in a high-throughput environment. This includes a streamlined assembly wrapper, based on the JGI's in-house WGS assembler, Jazz. It also includes the selection of sensible defaults targeted for metagenome data sets, as well as quality control automation for cleaning up the raw results. While analysis is ongoing, we will discuss preliminary assessments of the quality of the assembly results (http://fames.jgi-psf.org).

  3. 21 CFR 1401.6 - Expedited process.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 9 2010-04-01 2010-04-01 false Expedited process. 1401.6 Section 1401.6 Food and Drugs OFFICE OF NATIONAL DRUG CONTROL POLICY PUBLIC AVAILABILITY OF INFORMATION § 1401.6 Expedited process. (a) Requests and appeals will be given expedited treatment whenever ONDCP determines either: (1...

  4. 7 CFR 1.9 - Expedited processing.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 1 2010-01-01 2010-01-01 false Expedited processing. 1.9 Section 1.9 Agriculture... processing. (a) A requester may apply for expedited processing at the time of the initial request for records. Within ten calendar days of its receipt of a request for expedited processing, an agency shall decide...

  5. 21 CFR 20.44 - Expedited processing.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 1 2010-04-01 2010-04-01 false Expedited processing. 20.44 Section 20.44 Food and... Procedures and Fees § 20.44 Expedited processing. (a) The Food and Drug Administration will provide expedited processing of a request for records when the requester demonstrates a compelling need, or in other cases as...

  6. 28 CFR 802.8 - Expedited processing.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 2 2010-07-01 2010-07-01 false Expedited processing. 802.8 Section 802.8... DISCLOSURE OF RECORDS Freedom of Information Act § 802.8 Expedited processing. (a) Requests and appeals will... basis. (b) If you seek expedited processing, you must submit a statement, certified to be true and...

  7. Scalable metagenomics alignment research tool (SMART): a scalable, rapid, and complete search heuristic for the classification of metagenomic sequences from complex sequence populations.

    Science.gov (United States)

    Lee, Aaron Y; Lee, Cecilia S; Van Gelder, Russell N

    2016-07-28

    Next generation sequencing technology has enabled characterization of metagenomics through massively parallel genomic DNA sequencing. The complexity and diversity of environmental samples such as the human gut microflora, combined with the sustained exponential growth in sequencing capacity, has led to the challenge of identifying microbial organisms by DNA sequence. We sought to validate a Scalable Metagenomics Alignment Research Tool (SMART), a novel searching heuristic for shotgun metagenomics sequencing results. After retrieving all genomic DNA sequences from the NCBI GenBank, over 1 × 10^11 base pairs of 3.3 × 10^6 sequences from 9.25 × 10^5 species were indexed using 4 base pair hashtable shards. A MapReduce searching strategy was used to distribute the search workload in a computing cluster environment. In addition, a one base pair permutation algorithm was used to account for single nucleotide polymorphisms and sequencing errors. Simulated datasets used to evaluate Kraken, a similar metagenomics classification tool, were used to measure and compare precision and accuracy. Finally, using the same set of training sequences, we compared Kraken, CLARK, and SMART within the same computing environment. Utilizing 12 computational nodes, we completed the classification of all datasets in under 10 min each using exact matching with an average throughput of over 1.95 × 10^6 reads classified per minute. With permutation matching, we achieved sensitivity greater than 83% and precision greater than 94% with simulated datasets at the species classification level. We demonstrated the application of this technique applied to conjunctival and gut microbiome metagenomics sequencing results. In our head-to-head comparison, SMART and CLARK had similar accuracy gains over Kraken at the species classification level, but SMART required approximately half the amount of RAM of CLARK. SMART is the first scalable, efficient, and rapid metagenomics classification algorithm
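
    The idea of exact lookup followed by a one-base-pair permutation can be illustrated with a toy index. This is a sketch under stated assumptions, not SMART's actual data structure: it assumes the shards are keyed by the first 4 bases of each indexed sequence, and the index layout, function names, and taxon labels are hypothetical.

```python
def single_substitution_variants(kmer, alphabet="ACGT"):
    """Yield every sequence that differs from `kmer` at exactly one position."""
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def lookup(kmer, index, shard_prefix_len=4):
    """Query a sharded index: exact match first, then all one-mismatch variants.

    `index` maps a prefix (the shard key) to a dict of sequence -> set of taxa.
    """
    def exact(seq):
        shard = index.get(seq[:shard_prefix_len], {})
        return shard.get(seq, set())

    hits = exact(kmer)
    if hits:
        return set(hits)
    found = set()
    for variant in single_substitution_variants(kmer):
        found |= exact(variant)
    return found


toy_index = {"ACGT": {"ACGTAC": {"species_1"}}}
exact_hit = lookup("ACGTAC", toy_index)
mismatch_hit = lookup("ACGTAA", toy_index)  # last base differs
```

    Each query of length n expands to 3n variants, so tolerance to a single substitution costs a constant factor in lookups rather than a full alignment.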

  8. ISS Potable Water Quality for Expeditions 26 through 30

    Science.gov (United States)

    Straub, John E., II; Plumlee, Debrah K.; Schultz, John R.; McCoy, J. Torin

    2012-01-01

    International Space Station (ISS) Expeditions 26-30 spanned a 16-month period beginning in November of 2010 wherein the final 3 flights of the Space Shuttle program finished ISS construction and delivered supplies to support the post-shuttle era of station operations. Expedition crews relied on several sources of potable water during this period, including water recovered from urine distillate and humidity condensate by the U.S. water processor, water regenerated from humidity condensate by the Russian water recovery system, and Russian ground-supplied potable water. Potable water samples collected during Expeditions 26-30 were returned on Shuttle flights STS-133 (ULF5), STS-134 (ULF6), and STS-135 (ULF7), as well as Soyuz flights 24-27. The chemical quality of the ISS potable water supplies continued to be verified by the Johnson Space Center's Water and Food Analytical Laboratory (WAFAL) via analyses of returned water samples. This paper presents the chemical analysis results for water samples returned from Expeditions 26-30 and discusses their compliance with ISS potable water standards. The presence or absence of dimethylsilanediol (DMSD) is specifically addressed, since DMSD was identified as the primary cause of the temporary rise and fall in total organic carbon of the U.S. product water that occurred in the summer of 2010.

  9. Challenges and opportunities of airborne metagenomics.

    Science.gov (United States)

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-05-06

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Gene Prediction in Metagenomic Fragments with Deep Learning

    Directory of Open Access Journals (Sweden)

    Shao-Wu Zhang

    2017-01-01

    Full Text Available Next generation sequencing technologies used in metagenomics yield numerous sequencing fragments which come from thousands of different species. Accurately identifying genes from metagenomics fragments is one of the most fundamental issues in metagenomics. In this article, by fusing multifeatures (i.e., monocodon usage, monoamino acid usage, ORF length coverage, and Z-curve features) and using deep stacking networks learning model, we present a novel method (called Meta-MFDL) to predict the metagenomic genes. The results with 10 CV and independent tests show that Meta-MFDL is a powerful tool for identifying genes from metagenomic fragments.

  11. Gene Prediction in Metagenomic Fragments with Deep Learning.

    Science.gov (United States)

    Zhang, Shao-Wu; Jin, Xiang-Yang; Zhang, Teng

    2017-01-01

    Next generation sequencing technologies used in metagenomics yield numerous sequencing fragments which come from thousands of different species. Accurately identifying genes from metagenomics fragments is one of the most fundamental issues in metagenomics. In this article, by fusing multifeatures (i.e., monocodon usage, monoamino acid usage, ORF length coverage, and Z-curve features) and using deep stacking networks learning model, we present a novel method (called Meta-MFDL) to predict the metagenomic genes. The results with 10 CV and independent tests show that Meta-MFDL is a powerful tool for identifying genes from metagenomic fragments.
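
    Of the fused features, monocodon usage is the simplest to illustrate: it is the relative frequency of each of the 64 codons in a candidate ORF. The following sketch is an assumption about how such a feature could be computed, not the Meta-MFDL implementation; the function name and the handling of incomplete or ambiguous codons are illustrative choices.

```python
from itertools import product

# All 64 codons in lexicographic order over ACGT.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def monocodon_usage(orf):
    """Return the relative frequency of each codon in an in-frame ORF.

    Trailing bases that do not fill a codon, and codons containing
    symbols other than A, C, G, T, are skipped.
    """
    orf = orf.upper()
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, len(orf) - len(orf) % 3, 3):
        codon = orf[i:i + 3]
        if codon in counts:
            counts[codon] += 1
    total = sum(counts.values())
    return {c: (n / total if total else 0.0) for c, n in counts.items()}


usage = monocodon_usage("ATGAAAAAATAA")  # codons: ATG, AAA, AAA, TAA
```

    The resulting 64-dimensional vector would be concatenated with the other feature groups before being passed to the learning model.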

  12. Functional Metagenomics of the Bronchial Microbiome in COPD.

    Science.gov (United States)

    Millares, Laura; Pérez-Brocal, Vicente; Ferrari, Rafaela; Gallego, Miguel; Pomares, Xavier; García-Núñez, Marian; Montón, Concepción; Capilla, Silvia; Monsó, Eduard; Moya, Andrés

    2015-01-01

    The course of chronic obstructive pulmonary disease (COPD) is frequently aggravated by exacerbations, and changes in the composition and activity of the microbiome may be implicated in their appearance. The aim of this study was to analyse the composition and the gene content of the microbial community in bronchial secretions of COPD patients in both stability and exacerbation. Taxonomic data were obtained by 16S rRNA gene amplification and pyrosequencing, and metabolic information through shotgun metagenomics, using the Metagenomics RAST server (MG-RAST), and the PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) programme, which predicts metagenomes from 16S data. Eight severe COPD patients provided good quality sputum samples, and no significant differences in the relative abundance of any phyla and genera were found between stability and exacerbation. Bacterial biodiversity (Chao1 and Shannon indexes) did not show statistical differences and beta-diversity analysis (Bray-Curtis dissimilarity index) showed a similar microbial composition in the two clinical situations. Four functional categories showed statistically significant differences with MG-RAST at KEGG level 2: in exacerbation, Cell Growth and Death and Transport and Catabolism decreased in abundance [1.6 (0.2-2.3) vs 3.6 (3.3-6.9), p = 0.012; and 1.8 (0-3.3) vs 3.6 (1.8-5.1), p = 0.025 respectively], while Cancer and Carbohydrate Metabolism increased [0.8 (0-1.5) vs 0 (0-0.5), p = 0.043; and 7 (6.4-9) vs 5.9 (6.3-6.1), p = 0.012 respectively]. In conclusion, the bronchial microbiome as a whole is not significantly modified when exacerbation symptoms appear in severe COPD patients, but its functional metabolic capabilities show significant changes in several pathways.
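
    The three diversity measures named in this abstract have standard textbook definitions, sketched below for count vectors. The function names are illustrative, the Chao1 variant shown is the bias-corrected form (an assumption, since the abstract does not say which form was used), and the example counts are made up for demonstration only.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))


richness = chao1([1, 1, 2, 5])          # 4 observed taxa, 2 singletons, 1 doubleton
dissimilarity = bray_curtis([1, 0], [0, 1])
```

    Shannon and Chao1 describe a single sample (alpha diversity), while Bray-Curtis compares two samples (beta diversity), which is how the abstract uses them.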

  13. Assembling The Marine Metagenome, One Cell At A Time

    Energy Technology Data Exchange (ETDEWEB)

    Xie, Gang [Los Alamos National Laboratory; Han, Shunsheng [Los Alamos National Laboratory; Kiss, Hajnalka [Los Alamos National Laboratory; Saw, Jimmy [Los Alamos National Laboratory; Senin, Pavel [Los Alamos National Laboratory; Woyke, Tanja [DOE JOINT GENOME INST.; Copeland, Alex [DOE JOINT GENOME INST.; Gonzalez, Jose [UNIV OF LAGUNA, SPAIN; Chatterji, Sourav [DOE JOINT GENOME INST.; Cheng, Jan - Fang [DOE JOINT GENOME INST.; Eisen, Jonathan A [DOE JOINT GENOME INST.; Sieracki, Michael E [UNIV OF CA-DAVIS; Stepanauskas, Ramunas [BIGELOW LAB

    2008-01-01

    The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured taxa from a complex

  14. Assembling the Marine Metagenome, One Cell at a Time

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Xie, Gary; Copeland, Alex; Gonzalez, Jose M.; Han, Cliff; Kiss, Hajnalka; Saw, Jimmy H.; Senin, Pavel; Yang, Chi; Chatterji, Sourav; Cheng, Jan-Fang; Eisen, Jonathan A.; Sieracki, Michael E.; Stepanauskas, Ramunas

    2010-06-24

    The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured

  15. Metagenomics as a Tool for Enzyme Discovery: Hydrolytic Enzymes from Marine-Related Metagenomes.

    Science.gov (United States)

    Popovic, Ana; Tchigvintsev, Anatoly; Tran, Hai; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N; Yakunin, Alexander F

    2015-01-01

    This chapter discusses metagenomics and its application for enzyme discovery, with a focus on hydrolytic enzymes from marine metagenomic libraries. With less than one percent of environmental microorganisms being culturable, metagenomics, or the collective study of community genetics, has opened up a rich pool of uncharacterized metabolic pathways, enzymes, and adaptations. This great untapped pool of genes provides the particularly exciting potential to mine for new biochemical activities or novel enzymes with activities tailored to peculiar sets of environmental conditions. Metagenomes also represent a huge reservoir of novel enzymes for applications in biocatalysis, biofuels, and bioremediation. Here we present the results of enzyme discovery for four enzyme activities of particular industrial or environmental interest: esterase/lipase, glycosyl hydrolase, protease and dehalogenase.

  16. Functional Metagenome Mining of Soil for a Novel Gentamicin Resistance Gene.

    Science.gov (United States)

    Im, Hyunjoo; Kim, Kyung Mo; Lee, Sang-Heon; Ryu, Choong-Min

    2016-03-01

    Extensive use of antibiotics over recent decades has led to bacterial resistance against antibiotics, including gentamicin, one of the most effective aminoglycosides. The emergence of resistance is problematic for hospitals, since gentamicin is an important broad-spectrum antibiotic for the control of bacterial pathogens in the clinic. Previous studies to identify gentamicin resistance genes from environmental samples have been conducted using culture-dependent screening methods. To overcome these limitations, we employed a metagenome-based culture-independent protocol to identify gentamicin resistance genes. Through functional screening of metagenome libraries derived from soil samples, a fosmid clone was selected as it conferred strong gentamicin resistance. To identify a specific functioning gene conferring gentamicin resistance from a selected fosmid clone (35-40 kb), a shot-gun library was constructed and four shot-gun clones (2-3 kb) were selected. Further characterization of these clones revealed that they contained sequences similar to that of the RNA ligase, T4 rnlA, which is known as a toxin gene. The overexpression of the rnlA-like gene in Escherichia coli increased gentamicin resistance, indicating that this toxin gene modulates this trait. The results of our metagenome library analysis suggest that the rnlA-like gene may represent a new class of gentamicin resistance genes in pathogenic bacteria. In addition, we demonstrate that the soil metagenome can provide an important resource for the identification of antibiotic resistance genes, which are valuable molecular targets in efforts to overcome antibiotic resistance.

  17. Metagenomic survey for viruses in Western Arctic caribou, Alaska, through iterative assembly of taxonomic units.

    Directory of Open Access Journals (Sweden)

    Anita C Schürch

    Full Text Available Pathogen surveillance in animals does not provide a sufficient level of vigilance because it is generally confined to surveillance of pathogens with known economic impact in domestic animals and practically nonexistent in wildlife species. As most (re-)emerging viral infections originate from animal sources, it is important to obtain insight into viral pathogens present in the wildlife reservoir from a public health perspective. When monitoring living, free-ranging wildlife for viruses, sample collection can be challenging and availability of nucleic acids isolated from samples is often limited. The development of viral metagenomics platforms allows a more comprehensive inventory of viruses present in wildlife. We report a metagenomic viral survey of the Western Arctic herd of barren ground caribou (Rangifer tarandus granti) in Alaska, USA. The presence of mammalian viruses in eye and nose swabs of 39 free-ranging caribou was investigated by random amplification combined with a metagenomic analysis approach that applied exhaustive iterative assembly of sequencing results to define taxonomic units of each metagenome. Through homology search methods we identified the presence of several mammalian viruses, including different papillomaviruses, a novel parvovirus, polyomavirus, and a virus that potentially represents a member of a novel genus in the family Coronaviridae.

  18. 16S rRNA metagenome clustering and diversity estimation using locality sensitive hashing.

    Science.gov (United States)

    Rasheed, Zeehasham; Rangwala, Huzefa; Barbará, Daniel

    2013-01-01

    Advances in biotechnology have changed the manner of characterizing large populations of microbial communities that are ubiquitous across several environments. "Metagenome" sequencing involves decoding the DNA of organisms co-existing within ecosystems ranging from ocean and soil to the human body. Several researchers are interested in metagenomics because it provides an insight into the complex biodiversity across several environments. Clinicians are using metagenomics to determine the role played by collections of microbial organisms within the human body with respect to human health, wellness and disease. We have developed an efficient and scalable species richness estimation algorithm that uses locality sensitive hashing (LSH). Our algorithm achieves efficiency by approximating the pairwise sequence comparison operations using hashing and also incorporates a criterion based on matching fixed-length, gapless subsequences to improve the quality of sequence comparisons. We use an LSH-based similarity function to cluster similar sequences into individual groups, called operational taxonomic units (OTUs). We also compute different species diversity/richness metrics by utilizing OTU assignment results to further extend our analysis. The algorithm is evaluated on synthetic samples and eight targeted 16S rRNA metagenome samples taken from seawater. We compare the performance of our algorithm with several competing diversity estimation algorithms. We show the benefits of our approach with respect to computational runtime and meaningful OTU assignments. We also demonstrate the practical significance of the developed algorithm by comparing bacterial diversity and structure across different skin locations. http://www.cs.gmu.edu/~mlbio/LSH-DIV.
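
    To make the idea of hashing-based similarity concrete, the sketch below uses MinHash, one well-known LSH family for set similarity, over fixed-length gapless subsequences (k-mers), followed by a simple greedy grouping into OTU-like clusters. It is an illustration under stated assumptions and not the LSH-DIV algorithm: the hash construction, the parameters, the threshold, and the greedy clustering step are all choices made here, and the comparison against every cluster representative omits the bucketing that a full LSH scheme would use.

```python
import hashlib


def kmers(seq, k=8):
    """Set of fixed-length, gapless subsequences of `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_signature(seq, k=8, num_hashes=32):
    """For each seeded hash function, keep the smallest hash over all k-mers.

    Assumes len(seq) >= k; shorter sequences have no k-mers to hash.
    """
    words = kmers(seq, k)
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{w}".encode(), digest_size=8).digest(),
                "big",
            )
            for w in words
        ))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature slots, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def greedy_otus(seqs, threshold=0.9, k=8):
    """Assign each sequence to the first cluster whose representative is similar enough."""
    representatives, labels = [], []
    for seq in seqs:
        sig = minhash_signature(seq, k)
        for idx, rep in enumerate(representatives):
            if estimated_similarity(sig, rep) >= threshold:
                labels.append(idx)
                break
        else:
            representatives.append(sig)
            labels.append(len(representatives) - 1)
    return labels


labels = greedy_otus([
    "ACGTACGTAGCTAGCTAGGA",
    "ACGTACGTAGCTAGCTAGGA",
    "TTTTTTTTGGGGGGGGCCCC",
])
```

    The number of distinct labels is a crude richness estimate, and the per-cluster counts could feed the diversity metrics mentioned in the abstract.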

  19. EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies

    Science.gov (United States)

    Denise, Hubert; Potter, Simon; Salazar, Gustavo A; Pesseat, Sebastien; Hunter, Fiona M I; ten Hoopen, Petra; Alako, Blaise; Amid, Clara; Wilkinson, Darren J; Curtis, Thomas P

    2018-01-01

    EBI metagenomics (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the analysis and archiving of sequence data derived from the microbial populations found in a particular environment. Over the past two years, EBI metagenomics has increased the number of datasets analysed 10-fold. In addition to increased throughput, the underlying analysis pipeline has been overhauled to include both new or updated tools and reference databases. Of particular note is a new workflow for taxonomic assignments that has been extended to include assignments based on both the large and small subunit RNA marker genes and to encompass all cellular micro-organisms. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain. From these assemblies, we have produced a searchable, non-redundant protein database of over 50 million sequences. To provide improved access to the data stored within the resource, we have developed a programmatic interface that provides access to the analysis results and associated sample metadata. Finally, we have integrated the results of a series of statistical analyses that provide estimations of diversity and sample comparisons. PMID:29069476

  20. Metataxonomic and Metagenomic Approaches Versus Culture-Based Techniques For Clinical Pathology

    Directory of Open Access Journals (Sweden)

    Sarah K Hilton

    2016-04-01

    Full Text Available Diagnoses that are both timely and accurate are critically important for patients with life-threatening or drug resistant infections. Technological improvements in High-Throughput Sequencing (HTS) have led to its use in pathogen detection and its application in clinical diagnoses of infectious diseases. The present study compares two HTS methods, 16S rRNA marker gene sequencing (metataxonomics) and whole metagenomic shotgun sequencing (metagenomics), in their respective abilities to match the same diagnosis as traditional culture methods (culture inference) for patients with ventilator associated pneumonia (VAP). The metagenomic analysis was able to produce the same diagnosis as culture methods at the species-level for five of the six samples, while the metataxonomic analysis was only able to produce results with the same species-level identification as culture for two of the six samples. These results indicate that metagenomic analyses have the accuracy needed for a clinical diagnostic tool, but full integration in diagnostic protocols is contingent on technological improvements to decrease turnaround time and lower costs.

  1. Separating metagenomic short reads into genomes via clustering

    Directory of Open Access Journals (Sweden)

    Tanaseichuk Olga

    2012-09-01

    Full Text Available Abstract Background The metagenomics approach allows the simultaneous sequencing of all genomes in an environmental sample. This results in high complexity datasets, where in addition to repeats and sequencing errors, the number of genomes and their abundance ratios are unknown. Recently developed next-generation sequencing (NGS technologies significantly improve the sequencing efficiency and cost. On the other hand, they result in shorter reads, which makes the separation of reads from different species harder. Among the existing computational tools for metagenomic analysis, there are similarity-based methods that use reference databases to align reads and composition-based methods that use composition patterns (i.e., frequencies of short words or l-mers to cluster reads. Similarity-based methods are unable to classify reads from unknown species without close references (which constitute the majority of reads. Since composition patterns are preserved only in significantly large fragments, composition-based tools cannot be used for very short reads, which becomes a significant limitation with the development of NGS. A recently proposed algorithm, AbundanceBin, introduced another method that bins reads based on predicted abundances of the genomes sequenced. However, it does not separate reads from genomes of similar abundance levels. Results In this work, we present a two-phase heuristic algorithm for separating short paired-end reads from different genomes in a metagenomic dataset. We use the observation that most of the l-mers belong to unique genomes when l is sufficiently large. The first phase of the algorithm results in clusters of l-mers each of which belongs to one genome. During the second phase, clusters are merged based on l-mer repeat information. These final clusters are used to assign reads. The algorithm could handle very short reads and sequencing errors. 
It is initially designed for genomes with similar abundance levels and then
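The observation in this abstract, that most l-mers belong to a single genome once l is large enough, can be illustrated with a minimal sketch. The function names and toy sequences below are hypothetical and are not taken from the authors' software; reverse complements and sequencing errors are ignored for brevity.

```python
from collections import defaultdict


def lmers(seq, l):
    """Yield every overlapping substring of length l in seq."""
    for i in range(len(seq) - l + 1):
        yield seq[i:i + l]


def unique_lmer_fraction(genomes, l):
    """Return the fraction of distinct l-mers found in exactly one genome.

    genomes maps a genome name to its sequence string.
    """
    owners = defaultdict(set)  # l-mer -> names of genomes containing it
    for name, seq in genomes.items():
        for word in lmers(seq, l):
            owners[word].add(name)
    if not owners:
        return 0.0
    unique = sum(1 for names in owners.values() if len(names) == 1)
    return unique / len(owners)


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy = {"A": "ACGTACGTTGCA", "B": "TTGCAGGCATCG"}
    for l in (2, 4, 6):
        print(l, unique_lmer_fraction(toy, l))
```

On real genomes the fraction would be expected to rise with l, which is the property the two-phase clustering relies on.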

  2. Binning sequences using very sparse labels within a metagenome

    Directory of Open Access Journals (Sweden)

    Halgamuge Saman K

    2008-04-01

    Full Text Available Abstract Background In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity. Results The proposed seeding method is implemented on an unsupervised Growing Self-Organising Map (GSOM), and called Seeded GSOM (S-GSOM). We compared it with four well-known semi-supervised learning methods in a preliminary test, separating random-length prokaryotic sequence fragments sampled from the NCBI genome database. We identified the flanking sequences of the highly conserved 16S rRNA as suitable seeds that could be used to group the sequence fragments according to their species. S-GSOM showed superior performance compared to the semi-supervised methods tested. Additionally, S-GSOM may also be used to visually identify some species that do not have seeds. The proposed method was then applied to simulated metagenomic datasets using two different confidence threshold settings and compared with PhyloPythia, k-mer and BLAST. At the reference taxonomic level Order, S-GSOM outperformed all k-mer and BLAST results and showed comparable results with PhyloPythia for each of the corresponding confidence settings, where S-GSOM performed better than PhyloPythia in the ≥ 10 reads datasets and comparable in the ≥ 8 kb benchmark tests. Conclusion In the task of binning using semi-supervised learning methods, results indicate S-GSOM to be the best of

  3. MEBS, a software platform to evaluate large (meta)genomic collections according to their metabolic machinery: unraveling the sulfur cycle.

    Science.gov (United States)

    De Anda, Valerie; Zapata-Peñasco, Icoquih; Poot-Hernandez, Augusto Cesar; Eguiarte, Luis E; Contreras-Moreira, Bruno; Souza, Valeria

    2017-11-01

    The increasing number of metagenomic and genomic sequences has dramatically improved our understanding of microbial diversity, yet our ability to infer metabolic capabilities in such datasets remains challenging. We describe the Multigenomic Entropy Based Score pipeline (MEBS), a software platform designed to evaluate, compare, and infer complex metabolic pathways in large "omic" datasets, including entire biogeochemical cycles. MEBS is open source and available through https://github.com/eead-csic-compbio/metagenome_Pfam_score. To demonstrate its use, we modeled the sulfur cycle by exhaustively curating the molecular and ecological elements involved (compounds, genes, metabolic pathways, and microbial taxa). This information was reduced to a collection of 112 characteristic Pfam protein domains and a list of complete-sequenced sulfur genomes. Using the mathematical framework of relative entropy (H΄), we quantitatively measured the enrichment of these domains among sulfur genomes. The entropy of each domain was used both to build up a final score that indicates whether a (meta)genomic sample contains the metabolic machinery of interest and to propose marker domains in metagenomic sequences such as DsrC (PF04358). MEBS was benchmarked with a dataset of 2107 non-redundant microbial genomes from RefSeq and 935 metagenomes from MG-RAST. Its performance, reproducibility, and robustness were evaluated using several approaches, including random sampling, linear regression models, receiver operator characteristic plots, and the area under the curve metric (AUC). Our results support the broad applicability of this algorithm to accurately classify (AUC = 0.985) hard-to-culture genomes (e.g., Candidatus Desulforudis audaxviator), previously characterized ones, and metagenomic environments such as hydrothermal vents, or deep-sea sediment. 
Our benchmark indicates that an entropy-based score can capture the metabolic machinery of interest and can be used to efficiently classify
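As a rough illustration of how relative entropy can measure the enrichment of a protein domain in one set of genomes against a background, the sketch below computes the Kullback-Leibler divergence between two presence/absence distributions. This generic form is an assumption: the abstract does not give the exact definition of H΄ used by MEBS, and the function name and the frequencies in the example are hypothetical.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence, in bits, between two Bernoulli distributions.

    p: fraction of genomes of interest that contain the domain (0..1).
    q: fraction of background genomes that contain the domain (0 < q < 1).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    # Sum over the two outcomes: domain present, domain absent.
    for pi, qi in ((p, q), (1.0 - p, 1.0 - q)):
        if pi > 0.0:  # the term for pi == 0 is taken as 0 by convention
            total += pi * math.log2(pi / qi)
    return total


if __name__ == "__main__":
    # Hypothetical frequencies: domain seen in 90% of target genomes,
    # 10% of background genomes.
    print(relative_entropy(0.9, 0.1))
```

A value near zero means the domain is about as common in the target set as in the background; larger values indicate a stronger difference in frequency.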

  4. VizBin - an application for reference-independent visualization and human-augmented binning of metagenomic data.

    Science.gov (United States)

    Laczny, Cedric C; Sternal, Tomasz; Plugaru, Valentin; Gawron, Piotr; Atashpendar, Arash; Margossian, Houry Hera; Coronado, Sergio; der Maaten, Laurens van; Vlassis, Nikos; Wilmes, Paul

    2015-01-01

    Metagenomics is limited in its ability to link distinct microbial populations to genetic potential due to a current lack of representative isolate genome sequences. Reference-independent approaches, which exploit for example inherent genomic signatures for the clustering of metagenomic fragments (binning), offer the prospect to resolve and reconstruct population-level genomic complements without the need for prior knowledge. We present VizBin, a Java™-based application which offers efficient and intuitive reference-independent visualization of metagenomic datasets from single samples for subsequent human-in-the-loop inspection and binning. The method is based on nonlinear dimension reduction of genomic signatures and exploits the superior pattern recognition capabilities of the human eye-brain system for cluster identification and delineation. We demonstrate the general applicability of VizBin for the analysis of metagenomic sequence data by presenting results from two cellulolytic microbial communities and one human-borne microbial consortium. The superior performance of our application compared to other analogous metagenomic visualization and binning methods is also presented. VizBin can be applied de novo for the visualization and subsequent binning of metagenomic datasets from single samples, and it can be used for the post hoc inspection and refinement of automatically generated bins. Due to its computational efficiency, it can be run on common desktop machines and enables the analysis of complex metagenomic datasets in a matter of minutes. The software implementation is available at https://claczny.github.io/VizBin under the BSD License (four-clause) and runs under Microsoft Windows™, Apple Mac OS X™ (10.7 to 10.10), and Linux.

  5. Rescuing biogeographic legacy data: The "Thor" Expedition, a historical oceanographic expedition to the Mediterranean Sea.

    Science.gov (United States)

    Mavraki, Dimitra; Fanini, Lucia; Tsompanou, Marilena; Gerovasileiou, Vasilis; Nikolopoulou, Stamatina; Chatzinikolaou, Eva; Plaitis, Wanda; Faulwetter, Sarah

    2016-01-01

    This article describes the digitization of a series of historical datasets based on the reports of the 1908-1910 Danish Oceanographical Expeditions to the Mediterranean and adjacent seas. All station and sampling metadata as well as biodiversity data regarding calcareous rhodophytes, pelagic polychaetes, and fish (families Engraulidae and Clupeidae) obtained during these expeditions were digitized within the activities of the LifeWatchGreece Research Infrastructure project and presented in the present paper. The aim was to safeguard public data availability by using an open access infrastructure, and to prevent potential loss of valuable historical data on the Mediterranean marine biodiversity. The datasets digitized here cover 2,043 samples taken at 567 stations during a time period from 1904 to 1930 in the Mediterranean and adjacent seas. The samples resulted in 1,588 occurrence records of pelagic polychaetes, fish (Clupeiformes) and calcareous algae (Rhodophyta). In addition, basic environmental data (e.g. sea surface temperature, salinity) as well as meteorological conditions are included for most sampling events. In addition to the description of the digitized datasets, a detailed description of the problems encountered during the digitization of this historical dataset and a discussion on the value of such data are provided.

  6. The binning of metagenomic contigs for microbial physiology of mixed cultures

    Directory of Open Access Journals (Sweden)

    Marc Strous

    2012-12-01

    Full Text Available So far, microbial physiology has dedicated itself mainly to pure cultures. In nature, cross feeding and competition are important aspects of microbial physiology and these can only be addressed by studying complete communities such as enrichment cultures. Metagenomic sequencing is a powerful tool to characterize such mixed cultures. In the analysis of metagenomic data, well established algorithms exist for the assembly of short reads into contigs and for the annotation of predicted genes. However, the binning of the assembled contigs or unassembled reads is still a major bottleneck and is required to understand how the overall metabolism is partitioned over different community members. Binning consists of the clustering of contigs or reads that apparently originate from the same source population. In the present study eight metagenomic samples originating from the same habitat, a laboratory enrichment culture, were sequenced. Each sample contained 13-23 Mb of assembled contigs and up to eight abundant populations. Binning was attempted with existing methods but they were found to produce poor results, were slow, dependent on non-standard platforms or produced errors. A new binning procedure was developed based on multivariate statistics of tetranucleotide frequencies combined with the use of interpolated Markov models. Its performance was evaluated by comparison of the results between samples with BLAST and in comparison to existing algorithms for four publicly available metagenomes and one previously published artificial metagenome. The accuracy of the new approach was comparable to or higher than existing methods. Further, it was up to a hundred times faster. It was implemented in Java Swing as a complete open source graphical binning application available for download and further development (http://sourceforge.net/projects/metawatt).
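The tetranucleotide frequencies mentioned in this abstract can be sketched as a 256-element composition vector per contig. The code below is only a minimal illustration with hypothetical function names; it does not reproduce the multivariate statistics or the interpolated Markov models of the published tool, and it does not merge reverse complements.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freq(seq):
    """Return the relative frequency of each tetranucleotide in seq.

    Windows containing characters other than A, C, G, T are skipped.
    """
    counts = dict.fromkeys(TETRAMERS, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


if __name__ == "__main__":
    # Toy contigs, invented for illustration only.
    c1 = tetra_freq("ACGTACGTACGTACGT")
    c2 = tetra_freq("GGGGCCCCGGGGCCCC")
    print(euclidean(c1, c2))
```

Contigs from the same population would be expected to lie closer together in this vector space than contigs from different populations, which is what a clustering step can exploit.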

  7. Assembly of viral genomes from metagenomes

    Directory of Open Access Journals (Sweden)

    Saskia L Smits

    2014-12-01

    Full Text Available Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a single entity, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with a relatively large genome which were divergent to previously known viruses from the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near complete viral genomes.

  8. Towards a more complete metagenomics toolkit

    Science.gov (United States)

    The emerging scientific discipline of metagenomics has not only created a myriad of opportunities for biologists to reveal new insights into the microbial underpinnings of our environment, but has also presented a number of interesting challenges for bioinformatics algorithms and software developers...

  9. Applications of metagenomics for industrial bioproducts

    Science.gov (United States)

    Recent progress in mining the rich genetic resource of non-culturable microbes has led to the discovery of new genes, enzymes, and natural products. The impact of metagenomics is witnessed in the development of commodity and fine chemicals, agrochemicals and pharmaceuticals where the benefit of enz...

  10. An extended genovo metagenomic assembler by incorporating paired-end information

    Directory of Open Access Journals (Sweden)

    Afiahayati

    2013-10-01

    Full Text Available Metagenomes present assembly challenges when assembling multiple genomes from mixed reads of multiple species. An assembler for single genomes cannot adapt well when applied in this case. A metagenomic assembler, Genovo, is a de novo assembler for metagenomes under a generative probabilistic model. Genovo assembles all reads without discarding any reads in a preprocessing step, and is therefore able to extract more information from metagenomic data and, in principle, generate better assembly results. Paired end sequencing is currently widely used, yet Genovo was designed for 454 single end reads. In this research, we attempted to extend Genovo by incorporating paired-end information, named Xgenovo, so that it generates higher quality assemblies with paired end reads. First, we extended Genovo by adding a bonus parameter in the Chinese Restaurant Process, which is used as the prior that accounts for the unknown number of genomes in the sample. This bonus parameter encourages a pair of reads to be placed in the same contig and is an effort to solve the chimeric contig case. Second, we modified the sampling process of the location of a read in a contig. We used relative distance for the number of trials in the symmetric geometric distribution instead of using distance between the offset and the center of contig used in Genovo. Using this relative distance, a read sampled in the appropriate location has higher probability. Therefore a read will be mapped in the correct location. Results of extensive experiments on simulated metagenomic datasets from simple to complex with species coverage setting following uniform and lognormal distribution showed that Xgenovo can be superior to the original Genovo and the recently proposed metagenome assembler for 454 reads, MAP. Xgenovo successfully generated longer N50 than Genovo and MAP while maintaining the assembly quality even for very complex metagenomic datasets consisting of 115 species. Xgenovo also demonstrated the potential to
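The N50 statistic used above to compare assemblers is the contig length L such that contigs of length L or longer contain at least half of all assembled bases. A minimal sketch follows; the function name and the example lengths are illustrative and not drawn from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    if not lengths:
        return 0
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # not reached for non-empty input of positive lengths


if __name__ == "__main__":
    # Total is 54 bases; 10 + 9 + 8 = 27 covers half, so N50 is 8.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A longer N50 indicates that more of the assembly sits in long contigs, but it says nothing about correctness, which is why the abstract pairs it with a statement on assembly quality.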

  11. The Source and Evolutionary History of a Microbial Contaminant Identified Through Soil Metagenomic Analysis.

    Science.gov (United States)

    Olm, Matthew R; Butterfield, Cristina N; Copeland, Alex; Boles, T Christian; Thomas, Brian C; Banfield, Jillian F

    2017-02-21

    In this study, strain-resolved metagenomics was used to solve a mystery. A 6.4-Mbp complete closed genome was recovered from a soil metagenome and found to be astonishingly similar to that of Delftia acidovorans SPH-1, which was isolated in Germany a decade ago. It was suspected that this organism was not native to the soil sample because it lacked the diversity that is characteristic of other soil organisms; this suspicion was confirmed when PCR testing failed to detect the bacterium in the original soil samples. D. acidovorans was also identified in 16 previously published metagenomes from multiple environments, but detailed-scale single nucleotide polymorphism analysis grouped these into five distinct clades. All of the strains indicated as contaminants fell into one clade. Fragment length anomalies were identified in paired reads mapping to the contaminant clade genotypes only. This finding was used to establish that the DNA was present in specific size selection reagents used during sequencing. Ultimately, the source of the contaminant was identified as bacterial biofilms growing in tubing. On the basis of direct measurement of the rate of fixation of mutations across the period of time in which contamination was occurring, we estimated the time of separation of the contaminant strain from the genomically sequenced ancestral population within a factor of 2. This research serves as a case study of high-resolution microbial forensics and strain tracking accomplished through metagenomics-based comparative genomics. The specific case reported here is unusual in that the study was conducted in the background of a soil metagenome and the conclusions were confirmed by independent methods. IMPORTANCE It is often important to determine the source of a microbial strain. Examples include tracking a bacterium linked to a disease epidemic, contaminating the food supply, or used in bioterrorism. Strain identification and tracking are generally approached by using

  12. Detection of Novel Integrons in the Metagenome of Human Saliva.

    Directory of Open Access Journals (Sweden)

    Supathep Tansirichaiya

    Full Text Available Integrons are genetic elements capable of capturing and expressing open reading frames (ORFs) embedded within gene cassettes. They are involved in the dissemination of antibiotic resistance genes (ARGs) in clinically important pathogens. Although ARGs are common in the oral cavity, the association of integrons and antibiotic resistance has not been reported there. In this work, a PCR-based approach was used to investigate the presence of integrons and associated gene cassettes in human oral metagenomic DNA obtained from both the UK and Bangladesh. We identified a diverse array of gene cassettes containing ORFs predicted to confer antimicrobial resistance and other adaptive traits. The predicted proteins include a putative streptogramin A O-acetyltransferase, a bleomycin binding protein, cof-like hydrolase, competence and motility related proteins. This is the first study detecting integron gene cassettes directly from oral metagenomic DNA samples. The predicted proteins are likely to carry out a multitude of functions; however, the function of the majority is yet unknown.

  13. Comparative metagenome of a stream impacted by the urbanization phenomenon.

    Science.gov (United States)

    Medeiros, Julliane Dutra; Cantão, Maurício Egídio; Cesar, Dionéia Evangelista; Nicolás, Marisa Fabiana; Diniz, Cláudio Galuppo; Silva, Vânia Lúcia; Vasconcelos, Ana Tereza Ribeiro de; Coelho, Cíntia Marques

    Rivers and streams are important reservoirs of freshwater for human consumption. These ecosystems are threatened by increasing urbanization, because raw sewage discharged into them alters their nutrient content and may affect the composition of their microbial community. In the present study, we investigate the taxonomic and functional profile of the microbial community in an urban lotic environment. Samples of running water were collected at two points in the São Pedro stream: an upstream preserved and non-urbanized area, and a polluted urbanized area with discharged sewage. The metagenomic DNA was sequenced by pyrosequencing. Differences were observed in the community composition at the two sites. The non-urbanized area was overrepresented by genera of ubiquitous microbes that act in the maintenance of environments. In contrast, the urbanized metagenome was rich in genera pathogenic to humans. The functional profile indicated that the microbes act on the metabolism of methane, nitrogen and sulfur, especially in the urbanized area. It was also found that virulence/defense (antibiotic resistance and metal resistance) and stress response-related genes were disseminated in the urbanized environment. The structure of the microbial community was altered by uncontrolled anthropic interference, highlighting the selective pressure imposed by high loads of urban sewage discharged into freshwater environments. Copyright © 2016 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.

  14. Compressed sensing methods for DNA microarrays, RNA interference, and metagenomics.

    Science.gov (United States)

    Rao, Aditya; P, Deepthi; Renumadhavi, C H; Chandra, M Girish; Srinivasan, Rajgopal

    2015-02-01

    Compressed sensing (CS) is a sparse signal sampling methodology for efficiently acquiring and reconstructing a signal from relatively few measurements. Recent work shows that CS is well-suited to be applied to problems in genomics, including probe design in microarrays, RNA interference (RNAi), and taxonomic assignment in metagenomics. The principle of using different CS recovery methods in these applications has thus been established, but a comprehensive study of using a wide range of CS methods has not been done. For each of these applications, we apply three hitherto unused CS methods, namely, l1-magic, CoSaMP, and l1-homotopy, in conjunction with CS measurement matrices such as randomly generated CS m matrix, Hamming matrix, and projective geometry-based matrix. We find that, in RNAi, the l1-magic (the standard package for l1 minimization) and l1-homotopy methods show significant reduction in reconstruction error compared to the baseline. In metagenomics, we find that l1-homotopy as well as CoSaMP estimate concentration with significantly reduced time when compared to the GPSR and WGSQuikr methods.

  15. Metagenomic characterization of viral communities in Goseong Bay, Korea

    Science.gov (United States)

    Hwang, Jinik; Park, So Yun; Park, Mirye; Lee, Sukchan; Jo, Yeonhwa; Cho, Won Kyong; Lee, Taek-Kyun

    2016-12-01

    In this study, seawater samples were collected from Goseong Bay, Korea in March 2014 and viral populations were examined by metagenomics assembly. Enrichment of marine viral particles using FeCl3 followed by next-generation sequencing produced numerous sequences. De novo assembly and BLAST search showed that most of the obtained contigs were unknown sequences and only 0.74% of sequences were associated with known viruses. As a result, 138 viruses, including bacteriophages (87%), viruses infecting algae and others (13%) were identified. The identified 138 viruses were divided into 11 orders, 14 families, 34 genera, and 133 species. The dominant viruses were Pelagibacter phage HTVC010P and Roseobacter phage SIO1. The viruses infecting algae, including the Ostreococcus species, accounted for 9.4% of total identified viruses. In addition, we identified pathogenic herpes viruses infecting fishes and giant viruses infecting parasitic acanthamoeba species. This is a comprehensive study to reveal the viral populations in the Goseong Bay using metagenomics. The information associated with the marine viral community in Goseong Bay, Korea will be useful for comparative analysis in other marine viral communities.

  16. Metagenomic insights into important microbes from the Dead Zone

    Science.gov (United States)

    Thrash, C.; Baker, B.; Seitz, K.; Temperton, B.; Gillies, L.; Rabalais, N. N.; Mason, O. U.

    2015-12-01

    Coastal regions of eutrophication-driven oxygen depletion are widespread and increasing in number. Also known as dead zones, these regions take their name from the deleterious effects of hypoxia (dissolved oxygen less than 2 mg/L) on shrimp, demersal fish, and other animal life. Dead zones result from nutrient enrichment of primary production, concomitant consumption by chemoorganotrophic aerobic microorganisms, and strong stratification that prevents ventilation of bottom water. One of the largest dead zones in the world occurs seasonally in the northern Gulf of Mexico (nGOM), where hypoxia can reach up to 22,000 square kilometers. While this dead zone shares many features with more well-known marine oxygen minimum zones, it is nevertheless understudied with regard to the microbial assemblages involved in biogeochemical cycling. We performed metagenomic and metatranscriptomic sequencing on six samples from the 2013 nGOM dead zone from both hypoxic and oxic bottom waters. Assembly and binning led to the recovery of over fifty partial to nearly complete metagenomes from key microbial taxa previously determined to be numerically abundant from 16S rRNA data, such as Thaumarchaeota, Marine Group II Euryarchaeota, SAR406, SAR324, Synechococcus spp., and Planctomycetes. These results provide information about the roles of these taxa in the nGOM dead zone, and opportunities for comparing this region of low oxygen to others around the globe.

  17. Metagenomic analysis of the turkey gut RNA virus community

    Directory of Open Access Journals (Sweden)

    Scheffler Brian E

    2010-11-01

    Full Text Available Abstract Viral enteric disease is an ongoing economic burden to poultry producers worldwide, and despite considerable research, no single virus has emerged as a likely causative agent and target for prevention and control efforts. Historically, electron microscopy has been used to identify suspect viruses, with many small, round viruses eluding classification based solely on morphology. National and regional surveys using molecular diagnostics have revealed that suspect viruses continuously circulate in United States poultry, with many viruses appearing concomitantly and in healthy birds. High-throughput nucleic acid pyrosequencing is a powerful diagnostic technology capable of determining the full genomic repertoire present in a complex environmental sample. We utilized the Roche/454 Life Sciences GS-FLX platform to compile an RNA virus metagenome from turkey flocks experiencing enteric disease. This approach yielded numerous sequences homologous to viruses in the BLAST nr protein database, many of which have not been described in turkeys. Our analysis of this turkey gut RNA metagenome focuses in particular on the turkey-origin members of the Picornavirales, the Caliciviridae, and the turkey Picobirnaviruses.

  18. Comparative metagenome of a stream impacted by the urbanization phenomenon

    Directory of Open Access Journals (Sweden)

    Julliane Dutra Medeiros

    Full Text Available Abstract Rivers and streams are important reservoirs of freshwater for human consumption. These ecosystems are threatened by increasing urbanization, because raw sewage discharged into them alters their nutrient content and may affect the composition of their microbial community. In the present study, we investigate the taxonomic and functional profile of the microbial community in an urban lotic environment. Samples of running water were collected at two points in the São Pedro stream: an upstream preserved and non-urbanized area, and a polluted urbanized area with discharged sewage. The metagenomic DNA was sequenced by pyrosequencing. Differences were observed in the community composition at the two sites. The non-urbanized area was overrepresented by genera of ubiquitous microbes that act in the maintenance of environments. In contrast, the urbanized metagenome was rich in genera pathogenic to humans. The functional profile indicated that the microbes act on the metabolism of methane, nitrogen and sulfur, especially in the urbanized area. It was also found that virulence/defense (antibiotic resistance and metal resistance) and stress response-related genes were disseminated in the urbanized environment. The structure of the microbial community was altered by uncontrolled anthropic interference, highlighting the selective pressure imposed by high loads of urban sewage discharged into freshwater environments.

  19. Strain-Level Discrimination of Shiga Toxin-Producing Escherichia coli in Spinach Using Metagenomic Sequencing.

    Science.gov (United States)

    Leonard, Susan R; Mammel, Mark K; Lacher, David W; Elkins, Christopher A

    2016-01-01

    Consumption of fresh bagged spinach contaminated with Shiga toxin-producing Escherichia coli (STEC) has led to severe illness and death; however current culture-based methods to detect foodborne STEC are time consuming. Since not all STEC strains are considered pathogenic to humans, it is crucial to incorporate virulence characterization of STEC in the detection method. In this study, we assess the comprehensiveness of utilizing a shotgun metagenomics approach for detection and strain-level identification by spiking spinach with a variety of genomically disparate STEC strains at a low contamination level of 0.1 CFU/g. Molecular serotyping, virulence gene characterization, microbial community analysis, and E. coli core gene single nucleotide polymorphism (SNP) analysis were performed on metagenomic sequence data from enriched samples. It was determined from bacterial community analysis that E. coli, which was classified at the phylogroup level, was a major component of the population in most samples. However, in over half the samples, molecular serotyping revealed the presence of indigenous E. coli which also contributed to the percent abundance of E. coli. Despite the presence of additional E. coli strains, the serotype and virulence genes of the spiked STEC, including correct Shiga toxin subtype, were detected in 94% of the samples with a total number of reads per sample averaging 2.4 million. Variation in STEC abundance and/or detection was observed in replicate spiked samples, indicating an effect from the indigenous microbiota during enrichment. SNP analysis of the metagenomic data correctly placed the spiked STEC in a phylogeny of related strains in cases where the indigenous E. coli did not predominate in the enriched sample. Also, for these samples, our analysis demonstrates that strain-level phylogenetic resolution is possible using shotgun metagenomic data for determining the genomic relatedness of a contaminating STEC strain to other closely related E

  20. Strain-Level Discrimination of Shiga Toxin-Producing Escherichia coli in Spinach Using Metagenomic Sequencing.

    Directory of Open Access Journals (Sweden)

    Susan R Leonard

    Full Text Available Consumption of fresh bagged spinach contaminated with Shiga toxin-producing Escherichia coli (STEC) has led to severe illness and death; however current culture-based methods to detect foodborne STEC are time consuming. Since not all STEC strains are considered pathogenic to humans, it is crucial to incorporate virulence characterization of STEC in the detection method. In this study, we assess the comprehensiveness of utilizing a shotgun metagenomics approach for detection and strain-level identification by spiking spinach with a variety of genomically disparate STEC strains at a low contamination level of 0.1 CFU/g. Molecular serotyping, virulence gene characterization, microbial community analysis, and E. coli core gene single nucleotide polymorphism (SNP) analysis were performed on metagenomic sequence data from enriched samples. It was determined from bacterial community analysis that E. coli, which was classified at the phylogroup level, was a major component of the population in most samples. However, in over half the samples, molecular serotyping revealed the presence of indigenous E. coli which also contributed to the percent abundance of E. coli. Despite the presence of additional E. coli strains, the serotype and virulence genes of the spiked STEC, including correct Shiga toxin subtype, were detected in 94% of the samples with a total number of reads per sample averaging 2.4 million. Variation in STEC abundance and/or detection was observed in replicate spiked samples, indicating an effect from the indigenous microbiota during enrichment. SNP analysis of the metagenomic data correctly placed the spiked STEC in a phylogeny of related strains in cases where the indigenous E. coli did not predominate in the enriched sample. Also, for these samples, our analysis demonstrates that strain-level phylogenetic resolution is possible using shotgun metagenomic data for determining the genomic relatedness of a contaminating STEC strain to other

  1. The road to metagenomics: from microbiology to DNA sequencing technologies and bioinformatics

    Directory of Open Access Journals (Sweden)

    Alejandra Escobar-Zepeda

    2015-12-01

    Full Text Available The study of microorganisms that pervade each and every part of this planet has encountered many challenges through time such as the discovery of unknown organisms and the understanding of how they interact with their environment. The aim of this review is to take the reader along the timeline and major milestones that led us to modern metagenomics. This new and thriving area is likely to be an important contributor to solve different problems. The transition from classical microbiology to modern metagenomics studies has required the development of new branches of knowledge and specialization. Here, we will review how the availability of high-throughput sequencing technologies has transformed microbiology and bioinformatics and how we tackle the inherent computational challenges that arose from this DNA sequencing revolution. New computational methods are constantly developed to collect, process and extract useful biological information from a variety of samples and complex datasets, but metagenomics needs the integration of several of these computational methods. Despite the level of specialization needed in bioinformatics, it is important that life-scientists have a good understanding of it, for a correct experimental design that allows them to reveal the information in a metagenome.

  2. The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics.

    Science.gov (United States)

    Escobar-Zepeda, Alejandra; Vera-Ponce de León, Arturo; Sanchez-Flores, Alejandro

    2015-01-01

    The study of microorganisms that pervade each and every part of this planet has encountered many challenges through time such as the discovery of unknown organisms and the understanding of how they interact with their environment. The aim of this review is to take the reader along the timeline and major milestones that led us to modern metagenomics. This new and thriving area is likely to be an important contributor to solve different problems. The transition from classical microbiology to modern metagenomics studies has required the development of new branches of knowledge and specialization. Here, we will review how the availability of high-throughput sequencing technologies has transformed microbiology and bioinformatics and how to tackle the inherent computational challenges that arise from the DNA sequencing revolution. New computational methods are constantly developed to collect, process, and extract useful biological information from a variety of samples and complex datasets, but metagenomics needs the integration of several of these computational methods. Despite the level of specialization needed in bioinformatics, it is important that life-scientists have a good understanding of it for a correct experimental design, which allows them to reveal the information in a metagenome.

  3. Estimating DNA coverage and abundance in metagenomes using a gamma approximation

    Energy Technology Data Exchange (ETDEWEB)

    Hooper, Sean D; Dalevi, Daniel; Pati, Amrita; Mavromatis, Konstantinos; Ivanova, Natalia N; Kyrpides, Nikos C

    2010-01-01

    Shotgun sequencing generates large numbers of short DNA reads from either an isolated organism or, in the case of metagenomics projects, from the aggregate genome of a microbial community. These reads are then assembled based on overlapping sequences into larger, contiguous sequences (contigs). The feasibility of assembly and the coverage achieved (reads per nucleotide or distinct sequence of nucleotides) depend on several factors: the number of reads sequenced, the read length and the relative abundances of their source genomes in the microbial community. A low coverage suggests that most of the genomic DNA in the sample has not been sequenced, but it is often difficult to estimate either the extent of the uncaptured diversity or the amount of additional sequencing that would be most efficacious. In this work, we regard a metagenome as a population of DNA fragments (bins), each of which may be covered by one or more reads. We employ a gamma distribution to model this bin population due to its flexibility and ease of use. When a gamma approximation can be found that adequately fits the data, we may estimate the number of bins that were not sequenced and that could potentially be revealed by additional sequencing. We evaluated the performance of this model using simulated metagenomes and demonstrate its applicability on three recent metagenomic datasets.
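
    The estimate of unsequenced bins described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the gamma parameters (`shape`, `scale`) have already been fitted, that each bin's expected read count follows that gamma distribution, and that reads land on bins as a Poisson process. Under those assumptions the read count per bin is negative binomial, and the chance that a bin receives no reads is (1 + scale)^(-shape). The function names are hypothetical.

```python
def zero_read_probability(shape, scale):
    """Probability that a bin receives no reads (assumed Poisson-gamma model).

    Assumption: a bin's expected read count is Gamma(shape, scale) distributed
    and reads arrive as a Poisson process, so the read count is negative
    binomial with P(0 reads) = (1 + scale) ** (-shape).
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    return (1.0 + scale) ** (-shape)


def estimate_unseen_bins(observed_bins, shape, scale):
    """Estimate how many bins exist in the sample but were never sequenced."""
    p0 = zero_read_probability(shape, scale)
    # Observed bins are the fraction (1 - p0) of all bins in the sample.
    total_bins = observed_bins / (1.0 - p0)
    return total_bins - observed_bins


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    print(estimate_unseen_bins(observed_bins=300, shape=2.0, scale=1.0))
```

    A low fitted scale (few reads per bin on average) pushes the zero-read probability toward 1, so the estimated number of unseen bins grows quickly; this matches the abstract's point that low coverage implies much unsequenced DNA.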

  4. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

    Directory of Open Access Journals (Sweden)

    Yu-Chih Tsai

    2016-02-01

    Full Text Available Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation.

  5. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    Directory of Open Access Journals (Sweden)

    Ramy Karam Aziz

    2015-05-01

    Full Text Available Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.

  6. From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems.

    Science.gov (United States)

    Garza, Daniel R; Dutilh, Bas E

    2015-11-01

    Microorganisms and the viruses that infect them are the most numerous biological entities on Earth and enclose its greatest biodiversity and genetic reservoir. With strength in their numbers, these microscopic organisms are major players in the cycles of energy and matter that sustain all life. Scientists have only scratched the surface of this vast microbial world through culture-dependent methods. Recent developments in generating metagenomes, large random samples of nucleic acid sequences isolated directly from the environment, are providing comprehensive portraits of the composition, structure, and functioning of microbial communities. Moreover, advances in metagenomic analysis have created the possibility of obtaining complete or nearly complete genome sequences from uncultured microorganisms, providing important means to study their biology, ecology, and evolution. Here we review some of the recent developments in the field of metagenomics, focusing on the discovery of genetic novelty and on methods for obtaining uncultured genome sequences, including through the recycling of previously published datasets. Moreover we discuss how metagenomics has become a core scientific tool to characterize eco-evolutionary patterns of microbial ecosystems, thus allowing us to simultaneously discover new microbes and study their natural communities. We conclude by discussing general guidelines and challenges for modeling the interactions between uncultured microorganisms and viruses based on the information contained in their genome sequences. These models will significantly advance our understanding of the functioning of microbial ecosystems and the roles of microbes in the environment.

  7. Metagenomics: Tools and Insights for Analyzing Next-Generation Sequencing Data Derived from Biodiversity Studies

    Science.gov (United States)

    Oulas, Anastasis; Pavloudi, Christina; Polymenakou, Paraskevi; Pavlopoulos, Georgios A; Papanikolaou, Nikolas; Kotoulas, Georgios; Arvanitidis, Christos; Iliopoulos, Ioannis

    2015-01-01

    Advances in next-generation sequencing (NGS) have allowed significant breakthroughs in microbial ecology studies. This has led to the rapid expansion of research in the field and the establishment of “metagenomics”, often defined as the analysis of DNA from microbial communities in environmental samples without prior need for culturing. Many metagenomics statistical/computational tools and databases have been developed in order to allow the exploitation of the huge influx of data. In this review article, we provide an overview of the sequencing technologies and how they are uniquely suited to various types of metagenomic studies. We focus on the currently available bioinformatics techniques, tools, and methodologies for performing each individual step of a typical metagenomic dataset analysis. We also provide future trends in the field with respect to tools and technologies currently under development. Moreover, we discuss data management, distribution, and integration tools that are capable of performing comparative metagenomic analyses of multiple datasets using well-established databases, as well as commonly used annotation standards. PMID:25983555

  8. Metagenomic Analysis of Viruses in Feces from Unsolved Outbreaks of Gastroenteritis in Humans

    Science.gov (United States)

    Moore, Nicole E.; Wang, Jing; Hewitt, Joanne; Croucher, Dawn; Williamson, Deborah A.; Paine, Shevaun; Yen, Seiha; Greening, Gail E.

    2014-01-01

    The etiology of an outbreak of gastroenteritis in humans cannot always be determined, and ∼25% of outbreaks remain unsolved in New Zealand. It is hypothesized that novel viruses may account for a proportion of unsolved cases, and new unbiased high-throughput sequencing methods hold promise for their detection. Analysis of the fecal metagenome can reveal the presence of viruses, bacteria, and parasites which may have evaded routine diagnostic testing. Thirty-one fecal samples from 26 gastroenteritis outbreaks of unknown etiology occurring in New Zealand between 2011 and 2012 were selected for de novo metagenomic analysis. A total data set of 193 million sequence reads of 150 bp in length was produced on an Illumina MiSeq. The metagenomic data set was searched for virus and parasite sequences, with no evidence of novel pathogens found. Eight viruses and one parasite were detected, each already known to be associated with gastroenteritis, including adenovirus, rotavirus, sapovirus, and Dientamoeba fragilis. In addition, we also describe the first detection of human parechovirus 3 (HPeV3) in Australasia. Metagenomics may thus provide a useful audit tool when applied retrospectively to determine where routine diagnostic processes may have failed to detect a pathogen. PMID:25339401

  9. Comprehensive Diagnosis of Bacterial Infection Associated with Acute Cholecystitis Using Metagenomic Approach

    Directory of Open Access Journals (Sweden)

    Manabu Kujiraoka

    2017-04-01

    Full Text Available Acute cholecystitis (AC), which is strongly associated with retrograde bacterial infection, is an inflammatory disease that can be fatal if inappropriately treated. Currently, bacterial culture testing, which is basically recommended to detect the etiological agent, is a time-consuming (4–6 days), non-comprehensive approach. To rapidly detect a potential pathogen and predict its antimicrobial susceptibility, we undertook a metagenomic approach to characterize the bacterial infection associated with AC. Six patients (P1–P6) who underwent cholecystectomy for AC were enrolled in this study. Metagenome analysis demonstrated possible single or multiple bacterial infections in four patients (P1, P2, P3, and P4) with 24-h experimental procedures; in addition, the CTX-M extended-spectrum β-lactamase (ESBL) gene was identified in two bile samples (P1 and P4). Further whole genome sequencing of Escherichia coli isolates suggested that CTX-M-27-producing ST131 and CTX-M-14-producing novel-ST were identified in P1 and P4, respectively. Metagenome analysis of feces and saliva also suggested some imbalance in the microbiota for more comprehensive assessment of patients with AC. In conclusion, metagenome analysis was useful for rapid bacterial diagnostics, including assessing potential antimicrobial susceptibility, in patients with AC.

  10. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Directory of Open Access Journals (Sweden)

    Kathy N Lam

    Full Text Available High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  11. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs.

    Science.gov (United States)

    Eloe-Fadrosh, Emiley A; Paez-Espino, David; Jarett, Jessica; Dunfield, Peter F; Hedlund, Brian P; Dekas, Anne E; Grasby, Stephen E; Brady, Allyson L; Dong, Hailiang; Briggs, Brandon R; Li, Wen-Jun; Goudeau, Danielle; Malmstrom, Rex; Pati, Amrita; Pett-Ridge, Jennifer; Rubin, Edward M; Woyke, Tanja; Kyrpides, Nikos C; Ivanova, Natalia N

    2016-01-27

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery.

  12. WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads

    Directory of Open Access Journals (Sweden)

    Jünemann Sebastian

    2009-12-01

    Full Text Available Background. Metagenomics is a new field of research on natural microbial communities. High-throughput sequencing techniques like 454 or Solexa-Illumina promise new possibilities as they are able to produce huge amounts of data in much shorter time and with less effort and cost than the traditional Sanger technique. But the data produced comes in even shorter reads (35-100 basepairs with Illumina, 100-500 basepairs with 454-sequencing). CARMA is a new software pipeline for the characterisation of species composition and the genetic potential of microbial samples using short, unassembled reads. Results. In this paper, we introduce WebCARMA, a refined version of CARMA available as a web application for the taxonomic and functional classification of unassembled (ultra-)short reads from metagenomic communities. In addition, we have analysed the applicability of ultra-short reads in metagenomics. Conclusions. We show that unassembled reads as short as 35 bp can be used for the taxonomic classification of a metagenome. The web application is freely available at http://webcarma.cebitec.uni-bielefeld.de.

  13. Genome diversity of marine phages recovered from Mediterranean metagenomes: Size matters.

    Directory of Open Access Journals (Sweden)

    Mario López-Pérez

    2017-09-01

    Full Text Available Marine viruses play a critical role not only in the global geochemical cycles but also in the biology and evolution of their hosts. Despite their importance, viral diversity remains underexplored mostly due to sampling and cultivation challenges. Direct sequencing approaches such as viromics have provided new insights into the marine viral world. As a complementary approach, we analysed 24 microbial metagenomes (>0.2 μm size range) obtained from six sites in the Mediterranean Sea that vary by depth, season and filter used to retrieve the fraction. Filter-size comparison showed a significant number of viral sequences that were retained on the larger-pore filters and were different from those found in the viral fraction from the same sample, indicating that some important viral information is missed when using only assembly from viromes. In addition, we were able to describe 1,323 viral genomic fragments that were more than 10 kb in length, of which 36 represented complete viral genomes, some of them retrieved from a cross-assembly of different metagenomes. Host prediction based on sequence methods revealed new phage groups infecting marine prokaryotes like SAR11, Cyanobacteria or SAR116. We also identified the first complete virophage from deep seawater and a new endemic clade of the recently discovered Marine group II Euryarchaeota virus. Furthermore, analysis of viral distribution using metagenomes and viromes indicated that most of the new phages were found exclusively in the Mediterranean Sea and some of them, mostly the ones recovered from deep metagenomes, do not recruit in any database, probably indicating higher variability and endemicity in Mediterranean bathypelagic waters. Together these data provide the first detailed picture of genomic diversity, spatial and depth variations of viral communities within the Mediterranean Sea using metagenome assembly.

  14. The Source and Evolutionary History of a Microbial Contaminant Identified Through Soil Metagenomic Analysis

    Directory of Open Access Journals (Sweden)

    Matthew R. Olm

    2017-02-01

    Full Text Available In this study, strain-resolved metagenomics was used to solve a mystery. A 6.4-Mbp complete closed genome was recovered from a soil metagenome and found to be astonishingly similar to that of Delftia acidovorans SPH-1, which was isolated in Germany a decade ago. It was suspected that this organism was not native to the soil sample because it lacked the diversity that is characteristic of other soil organisms; this suspicion was confirmed when PCR testing failed to detect the bacterium in the original soil samples. D. acidovorans was also identified in 16 previously published metagenomes from multiple environments, but detailed-scale single nucleotide polymorphism analysis grouped these into five distinct clades. All of the strains indicated as contaminants fell into one clade. Fragment length anomalies were identified in paired reads mapping to the contaminant clade genotypes only. This finding was used to establish that the DNA was present in specific size selection reagents used during sequencing. Ultimately, the source of the contaminant was identified as bacterial biofilms growing in tubing. On the basis of direct measurement of the rate of fixation of mutations across the period of time in which contamination was occurring, we estimated the time of separation of the contaminant strain from the genomically sequenced ancestral population within a factor of 2. This research serves as a case study of high-resolution microbial forensics and strain tracking accomplished through metagenomics-based comparative genomics. The specific case reported here is unusual in that the study was conducted in the background of a soil metagenome and the conclusions were confirmed by independent methods.

  15. A New Unsupervised Binning Approach for Metagenomic Sequences Based on N-grams and Automatic Feature Weighting.

    Science.gov (United States)

    Liao, Ruiqi; Zhang, Ruichang; Guan, Jihong; Zhou, Shuigeng

    2014-01-01

    The rapid development of high-throughput technologies enables researchers to sequence the whole metagenome of a microbial community sampled directly from the environment. The assignment of these sequence reads into different species or taxonomical classes is a crucial step for metagenomic analysis, which is referred to as binning of metagenomic data. Most traditional binning methods rely on known reference genomes for accurate assignment of the sequence reads, therefore cannot classify reads from unknown species without the help of close references. To overcome this drawback, unsupervised learning based approaches have been proposed, which need not any known species' reference genome for help. In this paper, we introduce a novel unsupervised method called MCluster for binning metagenomic sequences. This method uses N-grams to extract sequence features and utilizes automatic feature weighting to improve the performance of the basic K-means clustering algorithm. We evaluate MCluster on a variety of simulated data sets and a real data set, and compare it with three latest binning methods: AbundanceBin, MetaCluster 3.0, and MetaCluster 5.0. Experimental results show that MCluster achieves obviously better overall performance (F-measure) than AbundanceBin and MetaCluster 3.0 on long metagenomic reads (≥800 bp); while compared with MetaCluster 5.0, MCluster obtains a larger sensitivity, and a comparable yet more stable F-measure on short metagenomic reads (<300 bp). This suggests that MCluster can serve as a promising tool for effectively binning metagenomic sequences.
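
    The general idea of binning reads by N-gram composition can be sketched as follows. This is a simplified illustration, not MCluster itself: it uses plain K-means without the automatic feature weighting the abstract describes, takes the initial centroids from the caller, and all function names are hypothetical.

```python
import math
from itertools import product

ALPHABET = "ACGT"


def ngram_profile(seq, n=2):
    """Return normalized counts of all 4**n N-grams in lexicographic order."""
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    seq = seq.upper()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in counts:  # skip N-grams containing ambiguous bases such as N
            counts[gram] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in keys]


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's K-means (unweighted); returns one cluster label per point."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Toy reads: two AT-rich and two GC-rich sequences (illustrative only).
    reads = ["ATATATATAT", "TATATATATA", "GCGCGCGCGC", "CGCGCGCGCG"]
    profiles = [ngram_profile(r, n=2) for r in reads]
    print(kmeans(profiles, [profiles[0], profiles[2]]))
```

    Because no reference genome is consulted, reads are grouped purely by compositional similarity, which is what makes such approaches applicable to unknown species.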

  16. Metagenomic Analysis of Koumiss in Kazakhstan

    Directory of Open Access Journals (Sweden)

    Samat Kozhakhmetov

    2014-12-01

    Full Text Available Introduction. Koumiss is a low-alcohol product made from fermented mare's milk, which is popular in Kazakhstan, Russia, and other countries of Central Asia, China, and Mongolia. Natural mare's milk is fermented by a symbiosis of two types of microorganisms (lactobacteria and yeast). Koumiss's microbial composition varies depending on the geographical, climatic, and cultural conditions. Based on phenotypic characteristics of samples, Wu, R. and colleagues identified the following bacteria isolated in Inner Mongolia, an autonomous region of China: L. casei, L. helveticus, L. plantarum, L. coryniformis subsp. coryniformis, L. paracasei, L. kefiranofaciens, L. curvatus, L. fermentum, and W. kandleri. Studies of the yeast composition in koumiss also showed significant variations. Thus, of 87 isolated yeast cultures, 48.3% were related to Saccharomyces unisporus, 27.6% to Kluyveromyces marxianus, 15.0% to Pichia membranaefaciens, and 9.2% to Saccharomyces cerevisiae. The purpose of this study was to examine the bacterial composition in koumiss. Methods. To extract DNA, 1.8 ml of fermented milk was centrifuged to generate a pellet, which was suspended in 450 µl of lysis buffer P1 from the PowerFood Microbial DNA Isolation kit (MoBio Laboratories Inc, USA). Fragments of the 16S rRNA gene and ITS1 were amplified to determine the composition of the microflora. A plasmid library with the target insertions was obtained using the high-copy plasmid vector pGEM-T. Direct nucleotide sequencing was performed by the Sanger method using the BigDye Terminator v3.1 Cycle Sequencing Kit on an ABI 3730xl automatic genetic analyzer (Applied Biosystems, USA). The Informax Vector NTI Suite 9 and Sequence Scanner v1.0 software packages were used for the analysis. Results. Our studies showed that in most samples of koumiss from the Akmola region (Central Kazakhstan) the following bacterial species prevailed

  17. A highly optimized grid deployment: the metagenomic analysis example.

    Science.gov (United States)

    Aparicio, Gabriel; Blanquer, Ignacio; Hernández, Vicente

    2008-01-01

    Computational resources and computationally expensive processes are two topics that are not growing at the same rate. The availability of large amounts of computing resources in Grid infrastructures does not mean that efficiency is not an important issue. It is necessary to analyze the whole process to improve partitioning and submission schemas, especially in the most critical experiments. This is the case of metagenomic analysis, and this text shows the work done in order to optimize a Grid deployment, which has led to a reduction of the response time and the failure rates. Metagenomic studies aim at processing samples of multiple specimens to extract the genes and proteins that belong to the different species. In many cases, the sequencing of the DNA of many microorganisms is hindered by the impossibility of growing significant samples of isolated specimens. Many bacteria cannot survive alone, and require the interaction with other organisms. In such cases, the DNA information available belongs to different kinds of organisms. One important stage in metagenomic analysis consists of the extraction of fragments, followed by the comparison and analysis of their function. By comparison with existing chains whose function is well known, fragments can be classified. This process is computationally intensive and requires several iterations of alignment and phylogeny classification steps. Source samples reach several millions of sequences, which could reach up to thousands of nucleotides each. These sequences are compared to a selected part of the "Non-redundant" database which only includes the information from eukaryotic species. From this first analysis, a refining process is performed and alignment analysis is restarted from the results. This process requires several CPU years. The article describes and analyzes the difficulties of fragmenting, automating and checking the above operations in current Grid production environments. This environment has been

  18. Unbiased Detection of Respiratory Viruses by Use of RNA Sequencing-Based Metagenomics: a Systematic Comparison to a Commercial PCR Panel.

    Science.gov (United States)

    Graf, Erin H; Simmon, Keith E; Tardif, Keith D; Hymas, Weston; Flygare, Steven; Eilbeck, Karen; Yandell, Mark; Schlaberg, Robert

    2016-04-01

    Current infectious disease molecular tests are largely pathogen specific, requiring test selection based on the patient's symptoms. For many syndromes caused by a large number of viral, bacterial, or fungal pathogens, such as respiratory tract infections, this necessitates large panels of tests and has limited yield. In contrast, next-generation sequencing-based metagenomics can be used for unbiased detection of any expected or unexpected pathogen. However, barriers for its diagnostic implementation include incomplete understanding of analytical performance and complexity of sequence data analysis. We compared detection of known respiratory virus-positive (n = 42) and unselected (n = 67) pediatric nasopharyngeal swabs using an RNA sequencing (RNA-seq)-based metagenomics approach and Taxonomer, an ultrarapid, interactive, web-based metagenomics data analysis tool, with an FDA-cleared respiratory virus panel (RVP; GenMark eSensor). Untargeted metagenomics detected 86% of known respiratory virus infections, and additional PCR testing confirmed RVP results for only 2 (33%) of the discordant samples. In unselected samples, untargeted metagenomics had excellent agreement with the RVP (93%). In addition, untargeted metagenomics detected an additional 12 viruses that were either not targeted by the RVP or missed due to highly divergent genome sequences. Normalized viral read counts for untargeted metagenomics correlated with viral burden determined by quantitative PCR and showed high intrarun and interrun reproducibility. Partial or full-length viral genome sequences were generated in 86% of RNA-seq-positive samples, allowing assessment of antiviral resistance, strain-level typing, and phylogenetic relatedness. Overall, untargeted metagenomics had high agreement with a sensitive RVP, detected viruses not targeted by the RVP, and yielded epidemiologically and clinically valuable sequence information. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  19. Metagenomic studies of the Red Sea.

    Science.gov (United States)

    Behzad, Hayedeh; Ibarra, Martin Augusto; Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomics has significantly advanced the field of marine microbial ecology, revealing the vast diversity of previously unknown microbial life forms in different marine niches. The tremendous amount of data generated has enabled identification of a large number of microbial genes (metagenomes), their community interactions, adaptation mechanisms, and their potential applications in pharmaceutical and biotechnology-based industries. Comparative metagenomics reveals that microbial diversity is a function of the local environment, meaning that unique or unusual environments typically harbor novel microbial species with unique genes and metabolic pathways. The Red Sea has an abundance of unique characteristics; however, its microbiota is one of the least studied among marine environments. The Red Sea harbors approximately 25 hot anoxic brine pools, plus a vibrant coral reef ecosystem. Physicochemical studies describe the Red Sea as an oligotrophic environment that contains some of the warmest and saltiest waters in the world, with year-round high UV radiation. These characteristics are believed to have shaped the evolution of microbial communities in the Red Sea. Over-representation of genes involved in DNA repair, high-intensity light responses, and osmoregulation was found in the Red Sea metagenomic databases, suggesting acquisition of specific environmental adaptations by the Red Sea microbiota. The Red Sea brine pools harbor a diverse range of halophilic and thermophilic bacterial and archaeal communities, which are potential sources of enzymes for pharmaceutical and biotechnology-based applications. Understanding the mechanisms of these adaptations and their function within the larger ecosystem could also prove useful in light of predicted global warming scenarios where global ocean temperatures are expected to rise by 1-3°C in the next few decades. In this review, we provide an overview of the published metagenomic studies that were conducted in the Red Sea, and

  20. Metagenomic Approaches to Identify Novel Organisms from the Soil Environment in a Classroom Setting

    Directory of Open Access Journals (Sweden)

    Sadia J. Rahman

    2016-12-01

    Molecular Microbial Metagenomics is a research-based undergraduate course developed at Georgia State University. This semester-long course provides hands-on research experience in the area of microbial diversity and introduces molecular approaches to study diversity. Students are part of an ongoing research project that uses metagenomic approaches to isolate clones containing 16S ribosomal ribonucleic acid (rRNA) genes from a soil metagenomic library. These approaches not only provide a measure of microbial diversity in the sample but may also allow discovery of novel organisms. Metagenomic approaches differ from the traditional culturing methods in that they use molecular analysis of community deoxyribonucleic acid (DNA) instead of culturing individual organisms. Groups of students select a batch of 100 clones from a metagenomic library. Using universal primers to amplify 16S rRNA genes from the pool of DNA isolated from 100 clones, and a stepwise process of elimination, each group isolates individual clones containing 16S rRNA genes within their batch of 100 clones. The amplified 16S rRNA genes are sequenced and analyzed using bioinformatics tools to determine whether the rRNA gene belongs to a novel organism. This course provides avenues for active learning and enhances students’ conceptual understanding of microbial diversity. Average scores on six assessment methods used during field testing indicated that success in achieving different learning objectives varied between 84% and 95%, with 65% of the students demonstrating complete grasp of the project based on the end-of-project lab report. The authentic research experience obtained in this course is also expected to result in more undergraduates choosing research-based graduate programs or careers.
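
    The abstract above mentions a stepwise process of elimination for finding the 16S rRNA gene-containing clones within a pooled batch of 100, but does not give the exact scheme. The sketch below assumes one plausible scheme, repeated halving of the pool, and should not be read as the course's actual protocol. The names `find_positive_clones` and `pool_is_positive`, the clone labels, and the positions of the positive clones are all hypothetical; `pool_is_positive` stands in for a PCR with universal primers on pooled DNA.

```python
from typing import Callable, List, Sequence


def find_positive_clones(
    clones: Sequence[str],
    pool_is_positive: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the clones that test positive, by recursively halving pools.

    Assumption: a pool tests positive if and only if at least one clone
    in it carries the target gene (no false positives or negatives).
    """
    # An empty or negative pool cannot contain a positive clone.
    if not clones or not pool_is_positive(clones):
        return []
    # A positive pool of size one is itself a positive clone.
    if len(clones) == 1:
        return [clones[0]]
    # Otherwise split the pool in two and test each half separately.
    mid = len(clones) // 2
    return (find_positive_clones(clones[:mid], pool_is_positive)
            + find_positive_clones(clones[mid:], pool_is_positive))


# Hypothetical batch of 100 clones with two carriers of a 16S rRNA gene.
batch = [f"clone_{i:03d}" for i in range(100)]
carriers = {"clone_017", "clone_064"}


def simulated_pcr(pool: Sequence[str]) -> bool:
    """Stand-in for PCR on pooled DNA: positive if any carrier is present."""
    return any(clone in carriers for clone in pool)


print(find_positive_clones(batch, simulated_pcr))
```

    Testing pools rather than individual clones keeps the number of reactions well below 100 when only a few clones in the batch carry the gene, which is the usual motivation for an elimination scheme of this kind.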

  1. Exploration of soil metagenome diversity for prospection of enzymes involved in lignocellulosic biomass conversion

    Energy Technology Data Exchange (ETDEWEB)

    Alvarez, T.M.; Squina, F.M. [Laboratorio Nacional de Luz Sincrotron (LNLS), Campinas, SP (Brazil)]; Paixao, D.A.A.; Franco Cairo, J.P.L.; Buchli, F.; Ruller, R. [Laboratorio Nacional de Ciencia e Tecnologia do Bioetanol (CTBE), Campinas, SP (Brazil)]; Prade, R. [Oklahoma State University, Stillwater, OK (United States)]

    2012-07-01

    Metagenomics allows access to genetic information encoded in the DNA of microorganisms recalcitrant to cultivation. These microorganisms represent a reservoir of novel biocatalysts with potential application in environmentally friendly techniques that aim to overcome the dependence on fossil fuels and to diminish air and water pollution. The focus of our work is the generation of a tool kit of lignocellulolytic enzymes from the soil metagenome, which could be used for second-generation ethanol production. Environmental samples were collected at a sugarcane field after harvesting, where the microbial population involved in lignocellulose degradation is expected to be enriched due to the presence of straw covering the soil. The Sugarcane Bagasse-Degrading-Soil (SBDS) metagenome was sequenced by massively parallel Roche 454 sequencing. We identified a full repertoire of genes with significant matches to glycosyl hydrolase catalytic domains and carbohydrate-binding modules. Soil metagenomic libraries cloned into pUC19 were screened through functional assays. CMC-agar screening resulted in positive clones, revealing new cellulase-coding genes. A CMC-zymogram showed that one of these genes, designated E-1, corresponds to an enzyme that is secreted to the extracellular medium, suggesting that the cloned gene carried the original signal peptide. Enzymatic assays and analysis through capillary electrophoresis showed that E-1 was able to cleave internal glycosidic bonds of cellulose. New rounds of functional screening with chromogenic substrates are being conducted, aiming to generate a library of lignocellulolytic enzymes derived from the soil metagenome, which may become a key component in the development of second-generation biofuels. (author)

  2. Comparative metagenomics of bathypelagic plankton and bottom sediment from the Sea of Marmara.

    Science.gov (United States)

    Quaiser, Achim; Zivanovic, Yvan; Moreira, David; López-García, Purificación

    2011-02-01

    To extend comparative metagenomic analyses of the deep-sea, we produced metagenomic data by direct 454 pyrosequencing from bathypelagic plankton (1000 m depth) and bottom sediment of the Sea of Marmara, the gateway between the Eastern Mediterranean and the Black Seas. Data from small subunit ribosomal RNA (SSU rRNA) gene libraries and direct pyrosequencing of the same samples indicated that Gamma- and Alpha-proteobacteria, followed by Bacteroidetes, dominated the bacterial fraction in Marmara deep-sea plankton, whereas Planctomycetes, Delta- and Gamma-proteobacteria were the most abundant groups in high bacterial-diversity sediment. Group I Crenarchaeota/Thaumarchaeota dominated the archaeal plankton fraction, although group II and III Euryarchaeota were also present. Eukaryotes were highly diverse in SSU rRNA gene libraries, with group I (Duboscquellida) and II (Syndiniales) alveolates and Radiozoa dominating plankton, and Opisthokonta and Alveolates, sediment. However, eukaryotic sequences were scarce in pyrosequence data. Archaeal amo genes were abundant in plankton, suggesting that Marmara planktonic Thaumarchaeota are ammonia oxidizers. Genes involved in sulfate reduction, carbon monoxide oxidation, anammox and sulfatases were over-represented in sediment. Genome recruitment analyses showed that Alteromonas macleodii 'surface ecotype', Pelagibacter ubique and Nitrosopumilus maritimus were highly represented in 1000 m-deep plankton. A comparative analysis of Marmara metagenomes with ALOHA deep-sea and surface plankton, whale carcasses, Peru subsurface sediment and soil metagenomes clustered deep-sea Marmara plankton with deep-ALOHA plankton and whale carcasses, likely because of the suboxic conditions in the deep Marmara water column. The Marmara sediment clustered with the soil metagenome, highlighting the common ecological role of both types of microbial communities in the degradation of organic matter and the completion of biogeochemical cycles.

  3. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries.

    Science.gov (United States)

    Coughlan, Laura M; Cotter, Paul D; Hill, Colin; Alvarez-Ordóñez, Avelino

    2015-01-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present, and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalyzing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries.

  4. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries

    Directory of Open Access Journals (Sweden)

    Laura M Coughlan

    2015-06-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalysing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries.

  5. A case study for large-scale human microbiome analysis using JCVI's metagenomics reports (METAREP).

    Directory of Open Access Journals (Sweden)

    Johannes Goll

    As metagenomic studies continue to increase in their number, sequence volume and complexity, the scalability of biological analysis frameworks has become a rate-limiting factor to meaningful data interpretation. To address this issue, we have developed JCVI Metagenomics Reports (METAREP) as an open source tool to query, browse, and compare extremely large volumes of metagenomic annotations. Here we present improvements to this software including the implementation of a dynamic weighting of taxonomic and functional annotation, support for distributed searches, advanced clustering routines, and integration of additional annotation input formats. The utility of these improvements to data interpretation is demonstrated through the application of multiple comparative analysis strategies to shotgun metagenomic data produced by the National Institutes of Health Roadmap for Biomedical Research Human Microbiome Project (HMP) (http://nihroadmap.nih.gov). Specifically, the scalability of the dynamic weighting feature is evaluated and established by its application to the analysis of over 400 million weighted gene annotations derived from 14 billion short reads as predicted by the HMP Unified Metabolic Analysis Network (HUMAnN) pipeline. Further, the capacity of METAREP to facilitate the identification and simultaneous comparison of taxonomic and functional annotations including biological pathway and individual enzyme abundances from hundreds of community samples is demonstrated by providing scenarios that describe how these data can be mined to answer biological questions related to the human microbiome. These strategies provide users with a reference of how to conduct similar large-scale metagenomic analyses using METAREP with their own sequence data, while in this study they reveal insights into the nature and extent of variation in taxonomic and functional profiles across body habitats and individuals. Over one thousand HMP WGS datasets and the latest

  6. Metagenomics reshapes the concepts of RNA virus evolution by revealing extensive horizontal virus transfer.

    Science.gov (United States)

    Dolja, Valerian V; Koonin, Eugene V

    2017-11-08

    Virus metagenomics is a young research field, but it has already transformed our understanding of virus diversity and evolution, and illuminated at a new level the connections between virus evolution and the evolution and ecology of the hosts. In this review article, we examine the new picture of the evolution of RNA viruses, the dominant component of the eukaryotic virome, that is emerging from metagenomic data analysis. The major expansion of many groups of RNA viruses through metagenomics allowed the construction of substantially improved phylogenetic trees for the conserved virus genes, primarily, the RNA-dependent RNA polymerases (RdRp). In particular, a new superfamily of widespread, small positive-strand RNA viruses was delineated that unites tombus-like and noda-like viruses. Comparison of the genome architectures of RNA viruses discovered by metagenomics and by traditional methods reveals an extent of gene module shuffling among diverse virus genomes that far exceeds the previous appreciation of this evolutionary phenomenon. Most dramatically, inclusion of the metagenomic data in phylogenetic analyses of the RdRp resulted in the identification of numerous, strongly supported groups that encompass RNA viruses from diverse hosts including different groups of protists, animals and plants. Notwithstanding potential caveats, in particular, incomplete and uneven sampling of eukaryotic taxa, these highly unexpected findings reveal horizontal virus transfer (HVT) between diverse hosts as the central aspect of RNA virus evolution. The vast and diverse virome of invertebrates, particularly nematodes and arthropods, appears to be the reservoir, from which the viromes of plants and vertebrates evolved via multiple HVT events. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  7. SmashCommunity: A metagenomic annotation and analysis tool

    DEFF Research Database (Denmark)

    Arumugam, Manimozhiyan; Harrington, Eoghan D; Foerstner, Konrad U

    2010-01-01

    SUMMARY: SmashCommunity is a stand-alone metagenomic annotation and analysis pipeline suitable for data from Sanger and 454 sequencing technologies. It supports state-of-the-art software for essential metagenomic tasks such as assembly and gene prediction. It provides tools to estimate...... the quantitative phylogenetic and functional compositions of metagenomes, to compare compositions of multiple metagenomes and to produce intuitive visual representations of such analyses. AVAILABILITY: SmashCommunity is freely available at http://www.bork.embl.de/software/smash CONTACT: bork@embl.de....

  8. Metagenomic Systems Biology of the Human Microbiome

    DEFF Research Database (Denmark)

    Bonde, Ida

    The human microbiome is an integrated part of the human body, outnumbering the human cells by approximately a factor of 10. These microorganisms are very important for human health, hence knowledge about this, “our other genome”, has been growing rapidly in recent years. This is mainly due...... to the advances in next generation sequencing, which has allowed for large-scale metagenomics studies of different niches of the human microbiota. Especially the gut microbiota has been studied intensively. However, most studies have been purely descriptive, thus there is still a lot to learn regarding...... the interplay between species in the microbiota and also between the host and the inhabiting microorganisms. Additionally, the non-bacterial part of the microbiota, which includes bacteriophages, plasmids and micro-eukaryotes, is not very well described. In this thesis, metagenomics data from the human gut...

  9. Genomics and metagenomics in medical microbiology.

    Science.gov (United States)

    Padmanabhan, Roshan; Mishra, Ajay Kumar; Raoult, Didier; Fournier, Pierre-Edouard

    2013-12-01

    Over the last two decades, sequencing tools have evolved from laborious time-consuming methodologies to real-time detection and deciphering of genomic DNA. Genome sequencing, especially using next generation sequencing (NGS) has revolutionized the landscape of microbiology and infectious disease. This deluge of sequencing data has not only enabled advances in fundamental biology but also helped improve diagnosis, typing of pathogen, virulence and antibiotic resistance detection, and development of new vaccines and culture media. In addition, NGS also enabled efficient analysis of complex human micro-floras, both commensal, and pathological, through metagenomic methods, thus helping the comprehension and management of human diseases such as obesity. This review summarizes technological advances in genomics and metagenomics relevant to the field of medical microbiology. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Metagenomics: advances in ecology and biotechnology.

    Science.gov (United States)

    Steele, Helen L; Streit, Wolfgang R

    2005-06-15

    This review highlights the significant advances which have been made in prokaryotic ecology and biotechnology due to the application of metagenomic techniques. It is now possible to link processes to specific microorganisms in the environment, such as the detection of a new phototrophic process in marine bacteria, and to characterise the metabolic cooperation which takes place in mixed species biofilms. The range of prokaryote derived products available for biotechnology applications is increasing rapidly. The knowledge gained from analysis of biosynthetic pathways provides valuable information about enzymology and allows engineering of biocatalysts for specific processes. The expansion of metagenomic techniques to include alternative heterologous hosts for gene expression and the development of sophisticated assays which enable screening of thousands of clones offer the possibility of finding out even more valuable information about the prokaryotic world.

  11. Microbes in deep marine sediments viewed through amplicon sequencing and metagenomics

    Science.gov (United States)

    Biddle, J.; Leon, Z. R.; Russell, J. A., III; Martino, A. J.

    2016-12-01

    Nearly twenty percent of microbial biomass on Earth can be found in the marine subsurface. The majority of this is concentrated on continental margins, which have been investigated by scientific drilling. On the Costa Rica, Iberian and Peru Margins, sediment samples have been investigated through DNA extraction followed by amplicon and metagenomic sequencing. Overall, samples show a high degree of microbial diversity, including many lineages of newly defined groups. In this talk, metagenome assembled genomes of unusual lineages will be presented, including their relationships to shallower relatives. From Costa Rica, in particular, we have retrieved deep relatives of Lokiarchaeota and Thorarchaeota, as well as other deeply branching archaeal relatives. We discuss their genome similarities to both other archaea and eukaryotes. From the Iberian Margin, relatives of Atribacteria and Aerophobetes will be discussed. Finally, we will detail the knowledge lost or gained depending on whether samples are studied via amplicon sequencing or total metagenomics, as studies in other environments have shown that up to 15% of microbial diversity is ignored when samples are studied via amplicon sequencing alone.

  12. Time and space resolved deep metagenomics to investigate selection pressures on low abundant species in complex environments

    DEFF Research Database (Denmark)

    Albertsen, Mads; Saunders, Aaron Marc; Nielsen, Kåre Lehmann

    time and space resolved diversity of key species in the core EBPR community using available reference genomes and genomes assembled directly from metagenomes. A total of 500 Gb sequence was generated using Illumina HiSeq2000 for metagenomics and V4 16S rRNA gene sequencing. To track changes over time...... and between EBPR plants we sequenced a total of 10 samples from 3 different plants over a 3 year period at a depth of 25 Gb each. In addition, one time point was selected for deep sequencing, generating 200 Gb of sequence divided between replicates. Quantitative FISH analysis using >30 oligonucleotide probes...

  13. New Extremophilic Lipases and Esterases from Metagenomics

    Science.gov (United States)

    López-López, Olalla; Cerdán, Maria E; González Siso, Maria I

    2014-01-01

    Lipolytic enzymes catalyze the hydrolysis of ester bonds in the presence of water. In media with low water content or in organic solvents, they can catalyze synthetic reactions such as esterification and transesterification. Lipases and esterases, in particular those from extremophilic origin, are robust enzymes, functional under the harsh conditions of industrial processes owing to their inherent thermostability and resistance towards organic solvents, which combined with their high chemo-, regio- and enantioselectivity make them very attractive biocatalysts for a variety of industrial applications. Likewise, enzymes from extremophile sources can provide additional features such as activity at extreme temperatures, extreme pH values or high salinity levels, which could be interesting for certain purposes. New lipases and esterases have traditionally been discovered by the isolation of microbial strains producing lipolytic activity. The Genome Projects Era allowed genome mining, exploiting homology with known lipases and esterases, to be used in the search for new enzymes. The Metagenomic Era meant a step forward in this field with the study of the metagenome, the pool of genomes in an environmental microbial community. Current molecular biology techniques make it possible to construct total environmental DNA libraries, including the genomes of unculturable organisms, opening a new window to a vast field of unknown enzymes with new and unique properties. Here, we review the latest advances and findings from research into new extremophilic lipases and esterases, using metagenomic approaches, and their potential industrial and biotechnological applications. PMID:24588890

  14. Generating viral metagenomes from the coral holobiont

    Directory of Open Access Journals (Sweden)

    Karen Dawn Weynberg

    2014-05-01

    Reef-building corals comprise multipartite symbioses where the cnidarian animal is host to an array of eukaryotic and prokaryotic organisms, and the viruses that infect them. These viruses are critical elements of the coral holobiont, serving not only as agents of mortality, but also as potential vectors for lateral gene flow, and as elements encoding a variety of auxiliary metabolic functions. Consequently, understanding the functioning and health of the coral holobiont requires detailed knowledge of the associated viral assemblage and its function. Currently, the most tractable way of uncovering viral diversity and function is through metagenomic approaches, which are inherently difficult in corals because of the complex holobiont community, an extracellular mucus layer that all corals secrete, and the variety of sizes and structures of nucleic acids found in viruses. Here we present the first protocol for isolating, purifying and amplifying viral nucleic acids from corals based on mechanical disruption of cells. This method produces at least 50% higher yields of viral nucleic acids, has very low levels of cellular sequence contamination and captures wider viral diversity than previously used chemical-based extraction methods. We demonstrate that our mechanical-based method profiles a greater diversity of DNA and RNA genomes, including virus groups such as Retro-transcribing and ssRNA viruses, which are absent from metagenomes generated via chemical-based methods. In addition, we briefly present (and make publicly available) the first paired DNA and RNA viral metagenomes from the coral Acropora tenuis.

  15. Metagenomic approach for discovering new pathogens in infection disease outbreaks

    Directory of Open Access Journals (Sweden)

    Emanuela Giombini

    2011-09-01

    Viruses represent the most abundant biological components on earth. They can be found in every environment, from deep layers of oceans to animal bodies. Although several viruses have been isolated and sequenced, in each environment there are millions of different types of viruses that have not yet been identified. The advent of next-generation sequencing technologies, with their high-throughput capabilities, makes it possible to study in a single experiment the entire community of microorganisms present in a particular sample (the “microbiome”). These technologies have made the metagenomic approach more feasible, and with it the discovery and identification of new pathogens that may pose a threat to public health. This paper summarizes the most recent applications of next-generation sequencing to discover new viral pathogens during infectious disease outbreaks.

  16. Molecular Diagnosis of Orthopedic-Device-Related Infection Directly from Sonication Fluid by Metagenomic Sequencing.

    Science.gov (United States)

    Street, Teresa L; Sanderson, Nicholas D; Atkins, Bridget L; Brent, Andrew J; Cole, Kevin; Foster, Dona; McNally, Martin A; Oakley, Sarah; Peto, Leon; Taylor, Adrian; Peto, Tim E A; Crook, Derrick W; Eyre, David W

    2017-08-01

    Culture of multiple periprosthetic tissue samples is the current gold standard for microbiological diagnosis of prosthetic joint infections (PJI). Additional diagnostic information may be obtained through culture of sonication fluid from explants. However, current techniques can have relatively low sensitivity, with prior antimicrobial therapy and infection by fastidious organisms influencing results. We assessed if metagenomic sequencing of total DNA extracts obtained direct from sonication fluid can provide an alternative rapid and sensitive tool for diagnosis of PJI. We compared metagenomic sequencing with standard aerobic and anaerobic culture in 97 sonication fluid samples from prosthetic joint and other orthopedic device infections. Reads from Illumina MiSeq sequencing were taxonomically classified using Kraken. Using 50 derivation samples, we determined optimal thresholds for the number and proportion of bacterial reads required to identify an infection and confirmed our findings in 47 independent validation samples. Compared to results from sonication fluid culture, the species-level sensitivity of metagenomic sequencing was 61/69 (88%; 95% confidence interval [CI], 77 to 94%; for derivation samples 35/38 [92%; 95% CI, 79 to 98%]; for validation samples, 26/31 [84%; 95% CI, 66 to 95%]), and genus-level sensitivity was 64/69 (93%; 95% CI, 84 to 98%). Species-level specificity, adjusting for plausible fastidious causes of infection, species found in concurrently obtained tissue samples, and prior antibiotics, was 85/97 (88%; 95% CI, 79 to 93%; for derivation samples, 43/50 [86%; 95% CI, 73 to 94%]; for validation samples, 42/47 [89%; 95% CI, 77 to 96%]). High levels of human DNA contamination were seen despite the use of laboratory methods to remove it. Rigorous laboratory good practice was required to minimize bacterial DNA contamination. We demonstrate that metagenomic sequencing can provide accurate diagnostic information in PJI. Our findings, combined

  17. 31 CFR 10.82 - Expedited suspension.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 1 2010-07-01 2010-07-01 false Expedited suspension. 10.82 Section... INTERNAL REVENUE SERVICE Rules Applicable to Disciplinary Proceedings § 10.82 Expedited suspension. (a... suspension. A suspension under this section will commence on the date that written notice of the suspension...

  18. Book Review Expedition Field Techniques: Small Mammals ...

    African Journals Online (AJOL)

    Book Review. Expedition Field Techniques: Small Mammals (excluding bats) Second Edition. A. Barnett and J. Dutton. Published and distributed 1995 by the Expedition Advisory Centre, Royal Geographical Society, 1 Kensington Gore, London, SW7 2AR. 126 pages. Price: £8.50 (softcover). ISBN 0-907649-68-8.

  19. Draft Genome Sequence of Uncultured SAR324 Bacterium lautmerah10, Binned from a Red Sea Metagenome

    KAUST Repository

    Haroon, Mohamed

    2016-02-11

    A draft genome of SAR324 bacterium lautmerah10 was assembled from a metagenome of a surface water sample from the Red Sea, Saudi Arabia. The genome is more complete and has a higher G+C content than that of previously sequenced SAR324 representatives. Its genomic information shows a versatile metabolism that confers an advantage to SAR324, which is reflected in its distribution throughout different depths of the marine water column.

  20. Microbial community structure and dynamics in thermophilic composting viewed through metagenomics and metatranscriptomics

    OpenAIRE

    Luciana Principal Antunes; Layla Farage Martins; Roberta Verciano Pereira; Andrew Maltez Thomas; Deibs Barbosa; Leandro Nascimento Lemos; Gianluca Major Machado Silva; Livia Maria Silva Moura; George Willian Condomitti Epamino; Luciano Antonio Digiampietri; Karen Cristina Lombardi; Patricia Locosque Ramos; Ronaldo Bento Quaggio; Julio Cezar Franco de Oliveira; Renata Castiglioni Pascon

    2016-01-01

    Composting is a promising source of new organisms and thermostable enzymes that may be helpful in environmental management and industrial processes. Here we present results of metagenomic- and metatranscriptomic-based analyses of a large composting operation in the São Paulo Zoo Park. This composting exhibits a sustained thermophilic profile (50 °C to 75 °C), which seems to preclude fungal activity. The main novelty of our study is the combination of time-series sampling with shotgun DNA, 16S...

  1. Comparative Metagenomics Revealed Commonly Enriched Gene Sets in Human Gut Microbiomes

    OpenAIRE

    Kurokawa, Ken; Itoh, Takehiko; Kuwahara, Tomomi; Oshima, Kenshiro; Toh, Hidehiro; Toyoda, Atsushi; Takami, Hideto; Morita, Hidetoshi; Vineet K. Sharma; Tulika P. Srivastava; Todd D. Taylor; Noguchi, Hideki; Mori, Hiroshi; Ogura, Yoshitoshi; Dusko S. Ehrlich

    2007-01-01

    Numerous microbes inhabit the human intestine, many of which are uncharacterized or uncultivable. They form a complex microbial community that deeply affects human physiology. To identify the genomic features common to all human gut microbiomes as well as those variable among them, we performed a large-scale comparative metagenomic analysis of fecal samples from 13 healthy individuals of various ages, including unweaned infants. We found that, while the gut microbiota from unweaned infants we...

  2. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data

    OpenAIRE

    Silva, Genivaldo Gueiros Z.; Green, Kevin T; Dutilh, Bas E; Edwards, Robert A

    2015-01-01

    Summary: Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number an...

  3. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    Directory of Open Access Journals (Sweden)

    Florent E Angly

    2009-12-01

    Full Text Available Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and
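    The length normalization described in this abstract addresses a simple bias: a longer genome yields more reads than a shorter one at the same abundance. The sketch below illustrates the general idea by dividing hit counts by genome length before normalizing. It is not the GAAS implementation; it omits similarity weighting and alignment-length filtering, and all names and numbers are hypothetical.

```python
def length_normalized_abundance(hit_counts, genome_lengths):
    """Relative abundance of each genome after dividing hits by genome length."""
    weights = {
        name: hit_counts[name] / genome_lengths[name] for name in hit_counts
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hits to normalize")
    return {name: w / total for name, w in weights.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length of the community."""
    return sum(abundance[name] * genome_lengths[name] for name in abundance)


# Hypothetical community: equal hit counts, tenfold difference in length.
hits = {"small_phage": 10, "large_phage": 10}
lengths = {"small_phage": 5_000, "large_phage": 50_000}

abundance = length_normalized_abundance(hits, lengths)
avg_length = average_genome_length(abundance, lengths)
print(abundance, round(avg_length))
```

    With equal raw hit counts, the small genome comes out roughly ten times more abundant than the large one, which is consistent with the abstract's point that unnormalized analyses can underestimate organisms with small genomes.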

  4. Microbial Diversity and Biochemical Potential Encoded by Thermal Spring Metagenomes Derived from the Kamchatka Peninsula

    Directory of Open Access Journals (Sweden)

    Bernd Wemheuer

    2013-01-01

    Full Text Available Volcanic regions contain a variety of environments suitable for extremophiles. This study was focused on assessing and exploiting the prokaryotic diversity of two microbial communities derived from different Kamchatkian thermal springs by metagenomic approaches. Samples were taken from a thermoacidophilic spring near the Mutnovsky Volcano and from a thermophilic spring in the Uzon Caldera. Environmental DNA for metagenomic analysis was isolated from collected sediment samples by direct cell lysis. The prokaryotic community composition was examined by analysis of archaeal and bacterial 16S rRNA genes. A total number of 1235 16S rRNA gene sequences were obtained and used for taxonomic classification. Most abundant in the samples were members of Thaumarchaeota, Thermotogae, and Proteobacteria. The Mutnovsky hot spring was dominated by the Terrestrial Hot Spring Group, Kosmotoga, and Acidithiobacillus. The Uzon Caldera was dominated by uncultured members of the Miscellaneous Crenarchaeotic Group and Enterobacteriaceae. The remaining 16S rRNA gene sequences belonged to the Aquificae, Dictyoglomi, Euryarchaeota, Korarchaeota, Thermodesulfobacteria, Firmicutes, and some potential new phyla. In addition, the recovered DNA was used for generation of metagenomic libraries, which were subsequently mined for genes encoding lipolytic and proteolytic enzymes. Three novel genes conferring lipolytic and one gene conferring proteolytic activity were identified.

  5. Metagenomics of Bacterial Diversity in Villa Luz Caves with Sulfur Water Springs

    Science.gov (United States)

    Artacho, Alejandro; Bautista, José S.; Méndez, Roberto; Gamboa, María T.; Gamboa, Jesús R.; Gómez-Cruz, Rodolfo

    2018-01-01

    New biotechnology applications require in-depth preliminary studies of biodiversity. The methods of massive sequencing using metagenomics and bioinformatics tools offer us sufficient and reliable knowledge to understand environmental diversity, to know new microorganisms, and to take advantage of their functional genes. Villa Luz caves, in the southern Mexican state of Tabasco, are fed by at least 26 groundwater inlets, containing 300–500 mg L−1 H2S and <0.1 mg L−1 O2. We extracted environmental DNA for metagenomic analysis of collected samples in five selected Villa Luz cave sites, with pH values from 2.5 to 7. Foreign organisms found in this underground ecosystem can oxidize H2S to H2SO4. These include: biovermiculites, a bacterial association that can grow on the rock walls; snottites, which are whitish, viscous biofilms hanging from the rock walls; and sacks or bags of phlegm, which live within the aquatic environment of the springs. Through bacterial tag-encoded FLX amplicon pyrosequencing (bTEFAP), a total of 20,901 readings of amplification products from hypervariable regions V1 and V3 of 16S rRNA bacterial gene in whole and pure metagenomic DNA samples were generated. Seven bacterial phyla were identified. As a result, Proteobacteria was more frequent than Acidobacteria. Finally, acidophilic Proteobacteria was detected in UJAT5 sample. PMID:29361802

  6. Metagenomic investigation of the microbial diversity in a chrysotile asbestos mine pit pond, Lowell, Vermont, USA

    Directory of Open Access Journals (Sweden)

    Heather E. Driscoll

    2016-12-01

    Full Text Available Here we report on a metagenomics investigation of the microbial diversity in a serpentine-hosted aquatic habitat created by chrysotile asbestos mining activity at the Vermont Asbestos Group (VAG) Mine in northern Vermont, USA. The now-abandoned VAG Mine on Belvidere Mountain in the towns of Eden and Lowell includes three open-pit quarries, a flooded pit, mill buildings, roads, and >26 million metric tons of eroding mine waste that contribute alkaline mine drainage to the surrounding watershed. Metagenomes and water chemistry originated from aquatic samples taken at three depths (0.5 m, 3.5 m, and 25 m) along the water column at three distinct, offshore sites within the mine's flooded pit (near 44°46′00.7673″, −72°31′36.2699″; UTM NAD 83 Zone 18 T 0695720 E, 4960030 N). Whole metagenome shotgun Illumina paired-end sequences were quality trimmed and analyzed based on a translated nucleotide search of NCBI-NR protein database and lowest common ancestor taxonomic assignments. Our results show strata within the pit pond water column can be distinguished by taxonomic composition and distribution, pH, temperature, conductivity, light intensity, and concentrations of dissolved oxygen. At the phylum level, metagenomes from 0.5 m and 3.5 m contained a similar distribution of taxa and were dominated by Actinobacteria (46% and 53% of reads, respectively), Proteobacteria (45% and 38%, respectively), and Bacteroidetes (7% in both). The metagenomes from 25 m showed a greater diversity of phyla and a different distribution of reads than the two upper strata: Proteobacteria (60%), Actinobacteria (18%), Planctomycetes (10%), Bacteroidetes (5%) and Cyanobacteria (2.5%), Armatimonadetes (<1%), Verrucomicrobia (<1%), Firmicutes (<1%), and Nitrospirae (<1%). Raw metagenome sequence data from each sample reside in NCBI's Short Read Archive (SRA ID: SRP056095) and are accessible through NCBI BioProject PRJNA277916.

  7. Metagenomic investigation of the microbial diversity in a chrysotile asbestos mine pit pond, Lowell, Vermont, USA.

    Science.gov (United States)

    Driscoll, Heather E; Vincent, James J; English, Erika L; Dolci, Elizabeth D

    2016-12-01

    Here we report on a metagenomics investigation of the microbial diversity in a serpentine-hosted aquatic habitat created by chrysotile asbestos mining activity at the Vermont Asbestos Group (VAG) Mine in northern Vermont, USA. The now-abandoned VAG Mine on Belvidere Mountain in the towns of Eden and Lowell includes three open-pit quarries, a flooded pit, mill buildings, roads, and > 26 million metric tons of eroding mine waste that contribute alkaline mine drainage to the surrounding watershed. Metagenomes and water chemistry originated from aquatic samples taken at three depths (0.5 m, 3.5 m, and 25 m) along the water column at three distinct, offshore sites within the mine's flooded pit (near 44°46'00.7673″, −72°31'36.2699″; UTM NAD 83 Zone 18 T 0695720 E, 4960030 N). Whole metagenome shotgun Illumina paired-end sequences were quality trimmed and analyzed based on a translated nucleotide search of NCBI-NR protein database and lowest common ancestor taxonomic assignments. Our results show strata within the pit pond water column can be distinguished by taxonomic composition and distribution, pH, temperature, conductivity, light intensity, and concentrations of dissolved oxygen. At the phylum level, metagenomes from 0.5 m and 3.5 m contained a similar distribution of taxa and were dominated by Actinobacteria (46% and 53% of reads, respectively), Proteobacteria (45% and 38%, respectively), and Bacteroidetes (7% in both). The metagenomes from 25 m showed a greater diversity of phyla and a different distribution of reads than the two upper strata: Proteobacteria (60%), Actinobacteria (18%), Planctomycetes (10%), Bacteroidetes (5%) and Cyanobacteria (2.5%), Armatimonadetes (< 1%), Verrucomicrobia (< 1%), Firmicutes (< 1%), and Nitrospirae (< 1%). Raw metagenome sequence data from each sample reside in NCBI's Short Read Archive (SRA ID: SRP056095) and are accessible through NCBI BioProject PRJNA277916.
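    The lowest common ancestor (LCA) taxonomic assignment mentioned in this abstract gives a read the deepest taxon shared by all of its database hits. The sketch below shows the core idea on root-first lineage lists. It is an illustration under assumed inputs, not the pipeline the authors ran; the function name and example lineages are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon path shared by every lineage (root first)."""
    if not lineages:
        return []
    shared = []
    # zip stops at the shortest lineage, so ranks are compared level by level.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical hits for one read: they agree down to phylum only.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    ["Bacteria", "Proteobacteria", "Betaproteobacteria"],
]
assignment = lowest_common_ancestor(hits)
print(assignment)
```

    A read whose hits disagree at the class level is therefore counted at the phylum level, which is why phylum-level summaries like the ones quoted above are a natural output of this approach.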

  8. Unlocking the potential of metagenomics through replicated experimental design.

    NARCIS (Netherlands)

    Knight, R.; Jansson, J.; Field, D.; Fierer, N.; Desai, N.; Fuhrman, J.A.; Hugenholtz, P.; van der Lelie, D.; Meyer, F.; Stevens, R.; Bailey, M.J.; Gordon, J.I.; Kowalchuk, G.A.; Gilbert, J.A.

    2012-01-01

    Metagenomics holds enormous promise for discovering novel enzymes and organisms that are biomarkers or drivers of processes relevant to disease, industry and the environment. In the past two years, we have seen a paradigm shift in metagenomics to the application of cross-sectional and longitudinal

  9. Unlocking the potential of metagenomics through replicated experimental design

    NARCIS (Netherlands)

    Knight, R.; Jansson, J.; Field, D.; Fierer, N.; Desai, N.; Fuhrman, J.A.; Hugenholtz, P.; Van der Lelie, D.; Meyer, F.; Stevens, R.; Bailey, M.J.; Gordon, J.I.; Kowalchuk, G.A.; Gilbert, J.A.

    2012-01-01

    Metagenomics holds enormous promise for discovering novel enzymes and organisms that are biomarkers or drivers of processes relevant to disease, industry and the environment. In the past two years, we have seen a paradigm shift in metagenomics to the application of cross-sectional and longitudinal

  10. Cross-cutting activities: Soil quality and soil metagenomics

    OpenAIRE

    Motavalli, Peter P.; Garrett, Karen A.

    2008-01-01

    This presentation reports on the work of the SANREM CRSP cross-cutting activities "Assessing and Managing Soil Quality for Sustainable Agricultural Systems" and "Soil Metagenomics to Construct Indicators of Soil Degradation." The introduction gives an overview of the extensiveness of soil degradation globally and defines soil quality. The objectives of the soil quality cross cutting activity are: CCRA-4 (Soil Metagenomics)

  11. Online Semi-Supervised Learning: Algorithm and Application in Metagenomics

    NARCIS (Netherlands)

    Imangaliyev, S.; Keijser, B.J.F.; Crielaard, W.; Tsivtsivadze, E.

    2013-01-01

    As the amount of metagenomic data grows rapidly, online statistical learning algorithms are poised to play a key role in metagenome analysis tasks. Frequently, data are only partially labeled; that is, the dataset contains only partial information about the problem of interest. This work presents an algorithm and

  12. Online semi-supervised learning: algorithm and application in metagenomics

    NARCIS (Netherlands)

    Imangaliyev, S.; Keijser, B.J.; Crielaard, W.; Tsivtsivadze, E.; Li, G.Z.; Kim, S.; Hughes, M.; McLachlan, G.; Sun, H.; Hu, X.; Ressom, H.; Liu, B.; Liebman, M.

    2013-01-01

    As the amount of metagenomic data grows rapidly, online statistical learning algorithms are poised to play a key role in metagenome analysis tasks. Frequently, data are only partially labeled; that is, the dataset contains only partial information about the problem of interest. This work presents an algorithm

  13. Metagenomics: Retrospect and Prospects in High Throughput Age

    Directory of Open Access Journals (Sweden)

    Satish Kumar

    2015-01-01

    Full Text Available In recent years, metagenomics has emerged as a powerful tool for mining of hidden microbial treasure in a culture independent manner. In the last two decades, metagenomics has been applied extensively to exploit concealed potential of microbial communities from almost all sorts of habitats. A brief historic progress made over the period is discussed in terms of origin of metagenomics to its current state and also the discovery of novel biological functions of commercial importance from metagenomes of diverse habitats. The present review also highlights the paradigm shift of metagenomics from basic study of community composition to insight into the microbial community dynamics for harnessing the full potential of uncultured microbes with more emphasis on the implication of breakthrough developments, namely, Next Generation Sequencing, advanced bioinformatics tools, and systems biology.

  14. Assessment of metagenomic assembly using simulated next generation sequencing data

    DEFF Research Database (Denmark)

    Mende, Daniel R; Waller, Alison S; Sunagawa, Shinichi

    2012-01-01

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators...... with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved....... Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available....

  15. A Metagenomic Survey of Serpentinites and Nearby Soils in Taiwan

    Science.gov (United States)

    Li, K. Y.; Hsu, Y. W.; Chen, Y. W.; Huang, T. Y.; Shih, Y. J.; Chen, J. S.; Hsu, B. M.

    2016-12-01

    The serpentinite of Taiwan originated from the subduction zone of the Eurasian plate and the Philippine Sea plate. Many small bodies of serpentinite are scattered around the East Rift Valley, which is also one of the major agricultural areas in Taiwan. Since microbial communities play a role in both the weathering process and soil recovery, uncovering the microbial compositions in serpentinites and surrounding soils may help to clarify the roles of microorganisms on serpentinites during the natural weathering process. In this study, microorganisms growing on the surface of serpentinites, in the surrounding soil, and in agricultural soils miles away from serpentinite were collected. Next generation sequencing (NGS) was carried out to examine the metagenomes of the uncultured microbial communities in these samples. The sequences were further clustered into operational taxonomic units (OTUs) to analyze relative abundance, OTU heatmaps, and principal coordinates analysis (PCoA). Our data revealed that the different types of geographic material had their own distinct microbial community structures. In serpentinites, the heatmaps based on the phylogenetic pattern showed that the OTU distributions were similar in the phyla Bacteroidetes, Cyanobacteria, Proteobacteria, Verrucomicrobia, and WPS-1/WPS-2. On the other hand, the heatmaps of the phylogenetic pattern of agricultural soils showed that the OTU distributions in the phyla Chloroflexi, Acidobacteria, Actinobacteria, WPS-1/WPS-2, and Proteobacteria were similar. In soil near the serpentinite, some clusters of OTUs in the phyla Bacteroidetes, Cyanobacteria, and WPS-1/WPS-2 had disappeared. Our data provide evidence regarding the kinetic evolution of microbial communities in different geographic materials.
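    The relative-abundance and PCoA steps in this abstract both start from per-sample OTU counts. PCoA takes a matrix of pairwise dissimilarities between samples as input. The sketch below computes relative abundances and the Bray-Curtis dissimilarity between two samples. The abstract does not state which dissimilarity measure was used, so Bray-Curtis is an assumption, and the sample names and counts are invented.

```python
def relative_abundance(counts):
    """Convert raw OTU counts for one sample into fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: count / total for otu, count in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 to 1)."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


# Invented OTU counts for two sample types.
serpentinite = relative_abundance({"otu1": 30, "otu2": 10})
farm_soil = relative_abundance({"otu2": 10, "otu3": 30})

distance = bray_curtis(serpentinite, farm_soil)
print(distance)
```

    Computing this value for every pair of samples would give the dissimilarity matrix that an ordination method such as PCoA takes as input.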

  16. Three novel virophage genomes discovered from Yellowstone Lake metagenomes.

    Science.gov (United States)

    Zhou, Jinglie; Sun, Dawei; Childers, Alyson; McDermott, Timothy R; Wang, Yongjie; Liles, Mark R

    2015-01-15

    Virophages are a unique group of circular double-stranded DNA viruses that are considered parasites of giant DNA viruses, which in turn are known to infect eukaryotic hosts. In this study, the genomes of three novel Yellowstone Lake virophages (YSLVs)--YSLV5, YSLV6, and YSLV7--were identified from Yellowstone Lake through metagenomic analyses. The relative abundance of these three novel virophages and previously identified Yellowstone Lake virophages YSLV1 to -4 were determined in different locations of the lake, revealing that most of the sampled locations in the lake, including both mesophilic and thermophilic habitats, had multiple virophage genotypes. This likely reflects the diverse habitats or diversity of the eukaryotic hosts and their associated giant viruses that serve as putative hosts for these virophages. YSLV5 has a 29,767-bp genome with 32 predicted open reading frames (ORFs), YSLV6 has a 24,837-bp genome with 29 predicted ORFs, and YSLV7 has a 23,193-bp genome with 26 predicted ORFs. Based on multilocus phylogenetic analysis, YSLV6 shows a close evolutionary relationship with YSLV1 to -4, whereas YSLV5 and YSLV7 are distantly related to the others, and YSLV7 represents the fourth novel virophage lineage. In addition, the genome of YSLV5 has a G+C content of 51.1% that is much higher than all other known virophages, indicating a unique host range for YSLV5. These results suggest that virophages are abundant and have diverse genotypes that likely mirror diverse giant viral and eukaryotic hosts within the Yellowstone Lake ecosystem. This study discovered novel virophages present within the Yellowstone Lake ecosystem using a conserved major capsid protein as a phylogenetic anchor for assembly of sequence reads from Yellowstone Lake metagenomic samples. The three novel virophage genomes (YSLV5 to -7) were completed by identifying specific environmental samples containing these respective virophages, and closing gaps by targeted PCR and sequencing. 
Most of

  17. Contamination of the Arctic reflected in microbial metagenomes from the Greenland ice sheet

    Science.gov (United States)

    Hauptmann, Aviaja L.; Sicheritz-Pontén, Thomas; Cameron, Karen A.; Bælum, Jacob; Plichta, Damian R.; Dalgaard, Marlene; Stibal, Marek

    2017-07-01

    Globally emitted contaminants accumulate in the Arctic and are stored in the frozen environments of the cryosphere. Climate change influences the release of these contaminants through elevated melt rates, resulting in increased contamination locally. Our understanding of how biological processes interact with contamination in the Arctic is limited. Through shotgun metagenomic data and binned genomes from metagenomes we show that microbial communities, sampled from multiple surface ice locations on the Greenland ice sheet, have the potential for resistance to and degradation of contaminants. The microbial potential to degrade anthropogenic contaminants, such as toxic and persistent polychlorinated biphenyls, was found to be spatially variable and not limited to regions close to human activities. Binned genomes showed close resemblance to microorganisms isolated from contaminated habitats. These results indicate that, from a microbiological perspective, the Greenland ice sheet cannot be seen as a pristine environment.

  18. Metagenomic sequencing reveals microbiota and its functional potential associated with periodontal disease.

    Science.gov (United States)

    Wang, Jinfeng; Qi, Ji; Zhao, Hui; He, Shu; Zhang, Yifei; Wei, Shicheng; Zhao, Fangqing

    2013-01-01

    Although attempts have been made to reveal the relationships between bacteria and human health, little is known about the species and function of the microbial community associated with oral diseases. In this study, we report the sequencing of 16 metagenomic samples collected from dental swabs and plaques representing four periodontal states. Insights into the microbial community structure and the metabolic variation associated with periodontal health and disease were obtained. We observed a strong correlation between community structure and disease status, and described a core disease-associated community. A number of functional genes and metabolic pathways including bacterial chemotaxis and glycan biosynthesis were over-represented in the microbiomes of periodontal disease. A significant amount of novel species and genes were identified in the metagenomic assemblies. Our study enriches the understanding of the oral microbiome and sheds light on the contribution of microorganisms to the formation and succession of dental plaques and oral diseases.

  19. Metagenomics-Guided Mining of Commercially Useful Biocatalysts from Marine Microorganisms.

    Science.gov (United States)

    Uria, A R; Zilda, D S

    Marine microorganisms are a rich reservoir of highly diverse and unique biocatalysts that offer potential applications in food, pharmaceutical, fuel, and cosmetic industries. The fact that less than 1% of microbes in any marine habitat can be cultured under standard laboratory conditions has hampered access to their extraordinary biocatalytic potential. Metagenomics has recently emerged as a powerful and well-established tool to investigate the vast majority of hidden uncultured microbial diversity for the discovery of novel industrially relevant enzymes from different types of environmental samples, such as seawater, marine sediment, and symbiotic microbial consortia. In this review, we discuss approaches and methods in metagenomics that have been used and can potentially be used to mine commercially useful biocatalysts from uncultured marine microbes.

  20. The Characterization of Novel Tissue Microbiota Using an Optimized 16S Metagenomic Sequencing Pipeline.

    Science.gov (United States)

    Lluch, Jérôme; Servant, Florence; Païssé, Sandrine; Valle, Carine; Valière, Sophie; Kuchly, Claire; Vilchez, Gaëlle; Donnadieu, Cécile; Courtney, Michael; Burcelin, Rémy; Amar, Jacques; Bouchez, Olivier; Lelouvier, Benjamin

    2015-01-01

    Substantial progress in high-throughput metagenomic sequencing methodologies has enabled the characterisation of bacteria from various origins (for example gut and skin). However, the recently-discovered bacterial microbiota present within animal internal tissues has remained unexplored due to technical difficulties associated with these challenging samples. We have optimized a specific 16S rDNA-targeted metagenomics sequencing (16S metabarcoding) pipeline based on the Illumina MiSeq technology for the analysis of bacterial DNA in human and animal tissues. This was successfully achieved in various mouse tissues despite the high abundance of eukaryotic DNA and PCR inhibitors in these samples. We extensively tested this pipeline on mock communities, negative controls, positive controls and tissues and demonstrated the presence of novel tissue specific bacterial DNA profiles in a variety of organs (including brain, muscle, adipose tissue, liver and heart). The high throughput and excellent reproducibility of the method ensured exhaustive and precise coverage of the 16S rDNA bacterial variants present in mouse tissues. This optimized 16S metagenomic sequencing pipeline will allow the scientific community to catalogue the bacterial DNA profiles of different tissues and will provide a database to analyse host/bacterial interactions in relation to homeostasis and disease.

  1. The Characterization of Novel Tissue Microbiota Using an Optimized 16S Metagenomic Sequencing Pipeline.

    Directory of Open Access Journals (Sweden)

    Jérôme Lluch

    Full Text Available Substantial progress in high-throughput metagenomic sequencing methodologies has enabled the characterisation of bacteria from various origins (for example gut and skin). However, the recently-discovered bacterial microbiota present within animal internal tissues has remained unexplored due to technical difficulties associated with these challenging samples. We have optimized a specific 16S rDNA-targeted metagenomics sequencing (16S metabarcoding) pipeline based on the Illumina MiSeq technology for the analysis of bacterial DNA in human and animal tissues. This was successfully achieved in various mouse tissues despite the high abundance of eukaryotic DNA and PCR inhibitors in these samples. We extensively tested this pipeline on mock communities, negative controls, positive controls and tissues and demonstrated the presence of novel tissue specific bacterial DNA profiles in a variety of organs (including brain, muscle, adipose tissue, liver and heart). The high throughput and excellent reproducibility of the method ensured exhaustive and precise coverage of the 16S rDNA bacterial variants present in mouse tissues. This optimized 16S metagenomic sequencing pipeline will allow the scientific community to catalogue the bacterial DNA profiles of different tissues and will provide a database to analyse host/bacterial interactions in relation to homeostasis and disease.

  2. Functional Metagenomics as a Tool for Identification of New Antibiotic Resistance Genes from Natural Environments.

    Science.gov (United States)

    Dos Santos, Débora Farage Knupp; Istvan, Paula; Quirino, Betania Ferraz; Kruger, Ricardo Henrique

    2017-02-01

    Antibiotic resistance has become a major concern for human and animal health, as therapeutic alternatives to treat multidrug-resistant microorganisms are rapidly dwindling. The problem is compounded by low investment in antibiotic research and lack of new effective antimicrobial drugs on the market. Exploring environmental antibiotic resistance genes (ARGs) will help us to better understand bacterial resistance mechanisms, which may be the key to identifying new drug targets. Because most environment-associated microorganisms are not yet cultivable, culture-independent techniques are essential to determine which organisms are present in a given environmental sample and allow the assessment and utilization of the genetic wealth they represent. Metagenomics represents a powerful tool to achieve these goals using sequence-based and functional-based approaches. Functional metagenomic approaches are particularly well suited to the identification of new ARGs from natural environments because, unlike sequence-based approaches, they do not require previous knowledge of these genes. This review discusses functional metagenomics-based ARG research and describes new possibilities for surveying the resistome in environmental samples.

  3. Viral Metagenomics on Blood-Feeding Arthropods as a Tool for Human Disease Surveillance.

    Science.gov (United States)

    Brinkmann, Annika; Nitsche, Andreas; Kohl, Claudia

    2016-10-19

    Surveillance and monitoring of viral pathogens circulating in humans and wildlife, together with the identification of emerging infectious diseases (EIDs), are critical for the prediction of future disease outbreaks and epidemics at an early stage. It is advisable to sample a broad range of vertebrates and invertebrates at different temporospatial levels on a regular basis to detect possible candidate viruses at their natural source. However, virus surveillance systems can be costly in terms of finances and resources and inadequate for sampling sufficient numbers of different host species over space and time. Recent publications have presented the concept of a new virus surveillance system, coining the terms "flying biological syringes", "xenosurveillance" and "vector-enabled metagenomics". According to these novel and promising surveillance approaches, viral metagenomics on engorged mosquitoes might reflect the viral diversity of numerous mammals, birds and humans, combined in the mosquitoes' blood meal during feeding on the host. In this review article, we summarize the literature on vector-enabled metagenomics (VEM) techniques and their application in disease surveillance in humans. Furthermore, we highlight the combination of VEM and "invertebrate-derived DNA" (iDNA) analysis to identify the host DNA within the mosquito midgut.

  4. A tale of two approaches: how metagenomics and proteomics are shaping the future of encephalitis diagnostics.

    Science.gov (United States)

    Schubert, Ryan D; Wilson, Michael R

    2015-06-01

    We highlight how metagenomics and proteomics-based approaches are being applied to the problem of diagnosis in idiopathic encephalitis. Low cost, high-throughput next-generation sequencing platforms have enabled unbiased sequencing of biological samples. Rapid sequence-based computational algorithms then determine the source of all the nonhost (e.g., pathogen-derived) nucleic acids in a sample. This approach recently identified a case of neuroleptospirosis, resulting in a patient's dramatic clinical improvement with intravenous penicillin. Metagenomics also enabled the discovery of a neuroinvasive astrovirus in several patients. With regard to autoimmune encephalitis, advances in high throughput and efficient phage display of human peptides resulted in the discovery of autoantibodies against tripartite motif family members in a patient with paraneoplastic encephalitis. A complementary assay using ribosomes to display full-length human proteins identified additional autoantibody targets. Metagenomics and proteomics represent promising avenues of research to improve upon the diagnostic yield of current assays for infectious and autoimmune encephalitis, respectively.

  5. Synthetic biology approaches to improve biocatalyst identification in metagenomic library screening

    Science.gov (United States)

    Guazzaroni, María-Eugenia; Silva-Rocha, Rafael; Ward, Richard John

    2015-01-01

    There is a growing demand for enzymes with improved catalytic performance or tolerance to process-specific parameters, and biotechnology plays a crucial role in the development of biocatalysts for use in industry, agriculture, medicine and energy generation. Metagenomics takes advantage of the wealth of genetic and biochemical diversity present in the genomes of microorganisms found in environmental samples, and provides a set of new technologies directed towards screening for new catalytic activities from environmental samples with potential biotechnology applications. However, biased and low level of expression of heterologous proteins in Escherichia coli together with the use of non-optimal cloning vectors for the construction of metagenomic libraries generally results in an extremely low success rate for enzyme identification. The bottleneck arising from inefficient screening of enzymatic activities has been addressed from several perspectives; however, the limitations related to biased expression in heterologous hosts cannot be overcome by using a single approach, but rather requires the synergetic implementation of multiple methodologies. Here, we review some of the principal constraints regarding the discovery of new enzymes in metagenomic libraries and discuss how these might be resolved by using synthetic biology methods. PMID:25123225

  6. An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS.

    Science.gov (United States)

    Silva, Genivaldo Gueiros Z; Lopes, Fabyano A C; Edwards, Robert A

    2017-01-01

    One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to identify the abundance of functional genes of the organisms present in the sample, answering the question, "What can the organisms in the sample do?" Most currently available approaches do not scale with increasing data volumes, which is important because both the number and lengths of the reads provided by sequencing platforms keep increasing. Here, we present SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and up to 1000 times faster than other tools. SUPER-FOCUS is freely available at http://edwards.sdsu.edu/SUPERFOCUS .

  7. 29 CFR 1404.20 - Proper use of expedited arbitration.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 4 2010-07-01 2010-07-01 false Proper use of expedited arbitration. 1404.20 Section 1404... ARBITRATION SERVICES Expedited Arbitration § 1404.20 Proper use of expedited arbitration. (a) FMCS reserves the right to cease honoring request for Expedited Arbitration if a pattern of misuse of this becomes...

  8. Screening currency notes for microbial pathogens and antibiotic resistance genes using a shotgun metagenomic approach.

    Directory of Open Access Journals (Sweden)

    Saakshi Jalali

    Full Text Available Fomites are a well-known source of microbial infections and previous studies have provided insights into the sojourning microbiome of fomites from various sources. Paper currency notes are among the most commonly exchanged objects and their potential to transmit pathogenic organisms has been well recognized. Approaches to identify the microbiome associated with paper currency notes have been largely limited to culture-dependent approaches. Subsequent studies portrayed the use of 16S ribosomal RNA based approaches which provided insights into the taxonomical distribution of the microbiome. However, recent techniques including shotgun sequencing provide resolution at gene level and enable estimation of their copy numbers in the metagenome. We investigated the microbiome of Indian paper currency notes using a shotgun metagenome sequencing approach. Metagenomic DNA isolated from samples of frequently circulated denominations of Indian currency notes was sequenced using an Illumina HiSeq sequencer. Analysis of the data revealed presence of species belonging to both eukaryotic and prokaryotic genera. The taxonomic distribution at kingdom level revealed contigs mapping to eukaryota (70%), bacteria (9%), viruses and archaea (~1%). We identified 78 pathogens including Staphylococcus aureus, Corynebacterium glutamicum, Enterococcus faecalis, and 75 cellulose degrading organisms including Acidothermus cellulolyticus, Cellulomonas flavigena and Ruminococcus albus. Additionally, 78 antibiotic resistance genes were identified and 18 of these were found in all the samples. Furthermore, six out of 78 pathogens harbored at least one of the 18 common antibiotic resistance genes. To the best of our knowledge, this is the first report of a shotgun metagenome sequence dataset of paper currency notes, which can be useful for future applications such as bio-surveillance of exchangeable fomites for infectious agents.

  9. Comparing and Evaluating Metagenome Assembly Tools from a Microbiologist's Perspective - Not Only Size Matters!

    Directory of Open Access Journals (Sweden)

    John Vollmers

    Full Text Available With the constant improvement in cost-efficiency and quality of Next Generation Sequencing technologies, shotgun-sequencing approaches (such as metagenomics) have nowadays become the methods of choice for studying and classifying microorganisms from various habitats. The production of data has dramatically increased over the past years and processing and analysis steps are becoming more and more of a bottleneck. Limiting factors are partly the availability of computational resources, but mainly the bioinformatics expertise in establishing and applying appropriate processing and analysis pipelines. Fortunately, a large diversity of specialized software tools is nowadays available. Nevertheless, choosing the most appropriate methods for answering specific biological questions can be rather challenging, especially for non-bioinformaticians. In order to provide a comprehensive overview and guide for the microbiological scientific community, we assessed the most common and freely available metagenome assembly tools with respect to their output statistics, their sensitivity for low abundant community members and variability in resulting community profiles as well as their ease-of-use. In contrast to the highly anticipated "Critical Assessment of Metagenomic Interpretation" (CAMI) challenge, which uses general mock community-based assembler comparison, we here tested assemblers on real Illumina metagenome sequencing data from natural communities of varying complexity sampled from forest soil and algal biofilms. Our observations clearly demonstrate that different assembly tools can prove optimal, depending on the sample type, available computational resources and, most importantly, the specific research goal. In addition, we present detailed descriptions of the underlying principles and pitfalls of publicly available assembly tools from a microbiologist's perspective, and provide guidance regarding the user-friendliness, sensitivity and reliability of

  10. Detection of bacterial pathogens from clinical specimens using conventional microbial culture and 16S metagenomics: a comparative study.

    Science.gov (United States)

    Abayasekara, Lalanika M; Perera, Jennifer; Chandrasekharan, Vishvanath; Gnanam, Vaz S; Udunuwara, Nisala A; Liyanage, Dileepa S; Bulathsinhala, Nuwani E; Adikary, Subhashanie; Aluthmuhandiram, Janith V S; Thanaseelan, Chrishanthi S; Tharmakulasingam, D Portia; Karunakaran, Tharaga; Ilango, Janahan

    2017-09-19

    Infectious disease is the leading cause of death worldwide, and diagnosis of polymicrobial and fungal infections is increasingly challenging in the clinical setting. Conventionally, molecular detection is still the best method of species identification in clinical samples. However, the limitations of Sanger sequencing make diagnosis of polymicrobial infections one of the biggest hurdles in treatment. The development of massively parallel sequencing or next generation sequencing (NGS) has revolutionized the field of metagenomics, with wide application of the technology in identification of microbial communities in environmental sources, human gut and others. However, to date there has been no commercial application of this technology in infectious disease diagnostic settings. Credence Genomics Rapid Infection Detection™ test, is a molecular based diagnostic test that uses next generation sequencing of bacterial 16S rRNA gene and fungal ITS1 gene region to provide accurate identification of species within a clinical sample. Here we present a study comparing 16S and ITS1 metagenomic identification against conventional culture for clinical samples. Using culture results as gold standard, a comparison was conducted using patient specimens from a clinical microbiology lab. Metagenomics based results show a 91.8% concordance rate for culture positive specimens and 52.8% concordance rate with culture negative samples. 10.3% of specimens were also positive for fungal species which was not investigated by culture. Specificity and sensitivity for metagenomics analysis is 91.8 and 52.7% respectively. 16S based metagenomic identification of bacterial species within a clinical specimen is on par with conventional culture based techniques and when coupled with clinical information can lead to an accurate diagnostic tool for infectious disease diagnosis.
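
    The concordance figures in the abstract above are the kind of quantities that come out of a 2x2 comparison against a gold standard. The sketch below shows the standard definitions, with culture treated as the reference: agreement on culture-positive specimens corresponds to sensitivity, and agreement on culture-negative specimens to specificity. The function names and the specimen counts are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of gold-standard positives that the test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of gold-standard negatives that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (NOT from the study): culture is the gold standard.
culture_pos_detected = 90   # culture positive, metagenomics positive
culture_pos_missed = 8      # culture positive, metagenomics negative
culture_neg_agreed = 19     # culture negative, metagenomics negative
culture_neg_flagged = 17    # culture negative, metagenomics positive

sens = sensitivity(culture_pos_detected, culture_pos_missed)
spec = specificity(culture_neg_agreed, culture_neg_flagged)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

    A low specificity against culture does not by itself mean the sequencing calls are wrong: culture-negative specimens may contain organisms that culture fails to recover, which is one reason such comparisons are usually read alongside clinical information.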

  11. The microbiome of Brazilian mangrove sediments as revealed by metagenomics.

    Directory of Open Access Journals (Sweden)

    Fernando Dini Andreote

    Full Text Available Here we embark on a deep metagenomic survey that revealed the taxonomic and potential metabolic pathway aspects of mangrove sediment microbiology. The extraction of DNA from sediment samples and the direct application of pyrosequencing resulted in approximately 215 Mb of data from four distinct mangrove areas (BrMgv01 to 04) in Brazil. The taxonomic approaches applied revealed the dominance of Deltaproteobacteria and Gammaproteobacteria in the samples. Paired statistical analysis showed higher proportions of specific taxonomic groups in each dataset. The metabolic reconstruction indicated the possible occurrence of processes modulated by the prevailing conditions found in mangrove sediments. In terms of carbon cycling, the sequences indicated the prevalence of genes involved in the metabolism of methane, formaldehyde, and carbon dioxide. With respect to the nitrogen cycle, evidence for sequences associated with dissimilatory reduction of nitrate, nitrogen immobilization, and denitrification was detected. Sequences related to the production of adenylsulfate, sulfite, and H2S were relevant to the sulphur cycle. These data indicate that the microbial core involved in methane, nitrogen, and sulphur metabolism consists mainly of Burkholderiaceae, Planctomycetaceae, Rhodobacteraceae, and Desulfobacteraceae. Comparison of our data to datasets from soil and sea samples resulted in the allotment of the mangrove sediments between those samples. The results of this study add valuable data about the composition of microbial communities in mangroves and also shed light on possible transformations promoted by microbial organisms in mangrove sediments.

  12. Microbial community profiling of human saliva using shotgun metagenomic sequencing.

    Directory of Open Access Journals (Sweden)

    Nur A Hasan

    Full Text Available Human saliva is clinically informative of both oral and general health. Since next generation shotgun sequencing (NGS) is now widely used to identify and quantify bacteria, we investigated the bacterial flora of saliva microbiomes of two healthy volunteers and five datasets from the Human Microbiome Project, along with a control dataset containing short NGS reads from bacterial species representative of the bacterial flora of human saliva. GENIUS, a system designed to identify and quantify bacterial species using unassembled short NGS reads, was used to identify the bacterial species comprising the microbiomes of the saliva samples and datasets. Results, achieved within minutes and at greater than 90% accuracy, showed more than 175 bacterial species comprised the bacterial flora of human saliva, including bacteria known to be commensal human flora but also Haemophilus influenzae, Neisseria meningitidis, Streptococcus pneumoniae, and Gammaproteobacteria. Basic Local Alignment Search Tool (BLASTn) analysis in parallel reported ca. five times more species than those actually comprising the in silico sample. Both GENIUS and BLAST analyses of saliva samples identified major genera comprising the bacterial flora of saliva, but GENIUS provided a more precise description of species composition, identifying to strain in most cases and delivered results at least 10,000 times faster. Therefore, GENIUS offers a facile and accurate system for identification and quantification of bacterial species and/or strains in metagenomic samples.

  13. Gene-Based Pathogen Detection: Can We Use qPCR to Predict the Outcome of Diagnostic Metagenomics?

    DEFF Research Database (Denmark)

    Andersen, Sandra Christine; Fachmann, Mette Sofie Rousing; Kiil, Kristoffer

    2017-01-01

    In microbial food safety, molecular methods such as quantitative PCR (qPCR) and next-generation sequencing (NGS) of bacterial isolates can potentially be replaced by diagnostic shotgun metagenomics. However, the methods for pre-analytical sample preparation are often optimized for qPCR, and do not necessarily perform equally well for qPCR and sequencing. The present study investigates, through screening of methods, whether qPCR can be used as an indicator for the optimization of sample preparation for NGS-based shotgun metagenomics with a diagnostic focus. This was used on human fecal samples spiked with 10³ or 10⁶ colony-forming units (CFU)/g Campylobacter jejuni, as well as porcine fecal samples spiked with 10³ or 10⁶ CFU/g Salmonella typhimurium. DNA was extracted from the samples using variations of two widely used kits. The following quality parameters were measured: DNA concentration, qPCR, DNA...

  14. Metagenomic and whole-genome analysis reveals new lineages of gokushoviruses and biogeographic separation in the sea

    Directory of Open Access Journals (Sweden)

    Jessica Myriam Labonté

    2013-12-01

    Full Text Available Much remains to be learned about single-stranded (ss) DNA viruses in natural systems, and the evolutionary relationships among them. One of the eight recognized families of ssDNA viruses is the Microviridae, a group of viruses infecting bacteria. In this study we used metagenomic analysis, genome assembly and amplicon sequencing of purified ssDNA to show that bacteriophages belonging to the subfamily Gokushovirinae within the Microviridae are genetically diverse and widespread members of marine microbial communities. Metagenomic analysis of coastal samples from the Gulf of Mexico and British Columbia, Canada, revealed numerous sequences belonging to gokushoviruses and allowed the assembly of five putative genomes with an organization similar to chlamydiamicroviruses. Fragment recruitment to these genomes from different metagenomic data sets is consistent with gokushovirus genotypes being restricted to specific oceanic regions. Conservation among the assembled genomes allowed the design of degenerate primers that target an 800 bp fragment from the gene encoding the major capsid protein. Sequences could be amplified from coastal temperate and subtropical waters, but not from samples collected from the Arctic Ocean, or freshwater lakes. Phylogenetic analysis revealed that most sequences were distantly related to those from cultured representatives. Moreover, the sequences fell into at least seven distinct evolutionary groups, most of which were represented by one of the assembled metagenomes. Our results greatly expand the known sequence space for gokushoviruses, and reveal biogeographic separation and new evolutionary lineages of gokushoviruses in the oceans.

  15. Integrated metagenomic data analysis demonstrates that a loss of diversity in oral microbiota is associated with periodontitis.

    Science.gov (United States)

    Ai, Dongmei; Huang, Ruocheng; Wen, Jin; Li, Chao; Zhu, Jiangping; Xia, Li Charlie

    2017-01-25

    Periodontitis is an inflammatory disease affecting the tissues supporting teeth (periodontium). Integrative analysis of metagenomic samples from multiple periodontitis studies is a powerful way to examine microbiota diversity and interactions within the host oral cavity. A total of 43 subjects were recruited to participate in two previous studies profiling the microbial community of human subgingival plaque samples using shotgun metagenomic sequencing. We integrated metagenomic sequence data from those two studies, including six healthy controls, 14 sites representative of stable periodontitis, 16 sites representative of progressing periodontitis, and seven periodontal sites of unknown status. We applied phylogenetic diversity, differential abundance, and network analyses, as well as clustering, to the integrated dataset to compare microbiological community profiles among the different disease states. We found alpha-diversity, i.e., mean species diversity in sites or habitats at a local scale, to be the single strongest predictor of subjects' periodontitis status (P metagenomics sequencing and phylogenetic profiling are predictive of early periodontitis, leading to potential therapeutic intervention. Our results also support a keystone pathogen-mediated polymicrobial synergy and dysbiosis (PSD) model to explain the etiology of periodontitis. Apart from P. gingivalis, we identified three additional keystone species potentially mediating the progression of periodontitis based on pathogenic characteristics similar to those of known keystone pathogens.
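
    Alpha-diversity, as used in the abstract above, summarizes how many taxa a single sample contains and how evenly they are represented. One common measure is the Shannon index, H = -sum(p_i * ln p_i), where p_i is the relative abundance of taxon i. The sketch below assumes per-taxon read counts as input; the function name and example counts are hypothetical, and the study's own diversity metric (it mentions phylogenetic diversity) may differ.

```python
import math
from typing import Iterable


def shannon_index(counts: Iterable[int]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    if total == 0:
        raise ValueError("at least one taxon must have a positive count")
    return -sum((c / total) * math.log(c / total) for c in observed)


# Hypothetical per-taxon read counts for two samples (illustration only).
even_sample = [25, 25, 25, 25]      # four taxa, evenly represented
skewed_sample = [97, 1, 1, 1]       # same richness, dominated by one taxon

print(shannon_index(even_sample))    # equals ln(4) for a perfectly even sample
print(shannon_index(skewed_sample))  # lower: dominance reduces diversity
```

    Both samples contain four taxa, so richness alone cannot separate them; the index drops for the skewed sample because one taxon accounts for nearly all reads.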

  16. ISS Expedition 21/22 Press Kit

    Data.gov (United States)

    National Aeronautics and Space Administration — Press kit for ISS mission Expedition 21/22 from 10/2009-03/2010. Press kits contain information about each mission overview, crew, mission timeline, benefits, and...

  17. [Research in metagenomics and its applications in translational medicine].

    Science.gov (United States)

    Chen, Jia-huan; Sun, Zheng; Wang, Xiao-jun; Su, Xiao-quan; Ning, Kang

    2015-07-01

    Humans are born with microbiota, which accompany us throughout our life-span. There is an important symbiotic relationship between us and the microbial communities, thus microbial communities are of great importance to our health. All genomic information within this microbiota is referred to as "metagenomics" (also referred to as "human's second genome"). The analysis of high throughput metagenomic data generated from biomedical experiments would provide new approaches for translational research, and it has several applications in clinics. With the help of next generation sequencing technology and the emerging metagenomic approach (analysis of all genomic information in microbiota as a whole), we can overcome the pitfalls of the tedious traditional method of isolation and cultivation of single microbial species. The metagenomic approach can also help us to analyze the whole microbial community efficiently and offer deep insights into human-microbe relationships as well as new ideas on many biomedical problems. In this review, we summarize frontiers in metagenomic research, including new concepts and methods. Then, we focus on the applications of metagenomic research in medical research and clinical practice in recent years, which clearly show the importance of metagenomic research in the field of translational medicine.

  18. Educational expeditions - et norsk perspektiv

    Directory of Open Access Journals (Sweden)

    Andre Horgen

    2015-06-01

    Full Text Available Abstract: The topic of this article is the Norwegian concept of “friluftsliv” (outdoor life), used as a pedagogical tool to support personal growth. While supporting personal growth appears to be a central pedagogical strategy within Anglo-American and British youth expeditions and adventure programming, this does not appear to be the case in the Norwegian outdoor tradition. My research question is: Do Norwegian Outdoor Education students experience a learning outcome related to personal growth, and to their abilities as leaders/mentors, during ski expeditions? I have collected data through a three-year period, after three ski expeditions with Outdoor Education students from an outdoor bachelor-programme at Telemark University College. The students have given written answers to questions regarding personal growth in which several informants express thoughts about experiences related to “self” and “identity”. They reflect upon experiences related to “mastering” and “performing”, to acceptance of their own strengths and weaknesses, and about developing self-confidence. They also reflect upon learning outcomes related to interpersonal relations and abilities, self-control, communication and caregiving. The informants have experienced, as leaders/mentors, that it is important to be able to “read” situations, to make good assessments of the situations, and to make good decisions related to the situations. As a follow up to this, the informants highlight the importance of being aware of each individual in the group, the importance of encouragement, being positive and caregiving. This study has shown that ski expeditions in “a Norwegian tradition” may have a potential when it comes to encouraging reflections related to personal growth and leadership abilities. Hopefully this study can contribute to increased awareness of the pedagogical potential, for personal growth, within the Norwegian concept of

  19. New Bacterial Phytase through Metagenomic Prospection

    Directory of Open Access Journals (Sweden)

    Nathálya Farias

    2018-02-01

    Full Text Available Alkaline phytases from uncultured microorganisms, which hydrolyze phytate to less phosphorylated myo-inositols and inorganic phosphate, have great potential as additives in agricultural industry. The development of metagenomics has stemmed from the ineluctable evidence that as-yet-uncultured microorganisms represent the vast majority of organisms in most environments on earth. In this study, a gene encoding a phytase was cloned from red rice crop residues and castor bean cake using a metagenomics strategy. The amino acid identity between this gene and its closest published counterparts is lower than 60%. The phytase was named PhyRC001 and was biochemically characterized. This recombinant protein showed activity on sodium phytate, indicating that PhyRC001 is a hydrolase enzyme. The enzymatic activity was optimal at a pH of 7.0 and at a temperature of 35 °C. β-propeller phytases possess great potential as feed additives because they are the only type of phytase with high activity at neutral pH. Therefore, to explore and exploit the underlying mechanism for β-propeller phytase functions could be of great benefit to biotechnology.

  20. Comparing the effectiveness of metagenomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrix nemaeus).

    Science.gov (United States)

    Srivathsan, Amrita; Sha, John C M; Vogler, Alfried P; Meier, Rudolf

    2015-03-01

    Faecal samples are of great value as a non-invasive means to gather information on the genetics, distribution, demography, diet and parasite infestation of endangered species. Direct shotgun sequencing of faecal DNA could give information on these simultaneously, but this approach is largely untested. Here, we used two faecal samples to characterize the diet of two red-shanked doucs langurs (Pygathrix nemaeus) that were fed known foliage, fruits, vegetables and cereals. Illumina HiSeq produced ~74 and 67 million paired reads for these samples, of which ~ 10,000 (0.014%) and ~ 44,000 (0.066%), respectively, were of chloroplast origin. Sequences were matched against a database of available chloroplast 'barcodes' for angiosperms. The results were compared with 'metabarcoding' using PCR amplification of the P6 loop of trnL. Metagenomics identified seven and nine of the likely 16 diet plants while six and five were identified by metabarcoding. Metabarcoding produced thousands of reads consistent with the known diet, but the barcodes were too short to identify several plant species to genus. Metagenomics utilized multiple, longer barcodes that combined had greater power of identification. However, rare diet items were not recovered. Read numbers for diet species in metagenomic and metabarcoding data were correlated, indicating that both are useful for determining relative sequence abundance. Metagenomic reads were uniformly distributed across the chloroplast genomes; thus, if chloroplast genomes were used as reference, the precision of identifications and species recovery would improve further. Metagenomics also recovered the host mitochondrial genome and numerous intestinal parasite sequences in addition to generating data useful for characterizing the microbiome. © 2014 John Wiley & Sons Ltd.
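
    The chloroplast read percentages quoted in the abstract above follow directly from the approximate read counts it reports. The short check below reproduces that arithmetic; the function name is hypothetical, and the inputs are the rounded figures from the abstract, so the results are approximate as well.

```python
def percent_of_reads(subset_reads: int, total_reads: int) -> float:
    """Share of a read subset within a sequencing run, expressed in percent."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * subset_reads / total_reads


# Approximate figures as stated in the abstract (reads per faecal sample).
sample_1 = percent_of_reads(10_000, 74_000_000)
sample_2 = percent_of_reads(44_000, 67_000_000)

print(f"sample 1: {sample_1:.3f}% chloroplast reads")
print(f"sample 2: {sample_2:.3f}% chloroplast reads")
```

    The fractions are small, which is consistent with the abstract's remark that rare diet items were not recovered by shotgun sequencing.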

  1. Metagenomics of the Methane Ice Worm, Hesiocaeca methanicola

    Science.gov (United States)

    Goodwin, K. D.; Edsall, L.; Xin, W.; Head, S. R.; Gelbart, T.; Wood, A. M.; Gaasterland, T.

    2012-12-01

    The methane ice worm (Hesiocaeca methanicola) is a polychaete found on methane hydrate deposits for which there appears to be no publicly available genomic or metagenomic data. Methane ice worms were collected in 2009 by the Johnson-Sea-Link submersible (543 m depth; N 27:44.7526 W 91:13.3168). Next-generation sequencing (HiSeq2000) was applied to samples of tissue and gut contents. A subset of the assembled data (40M reads, randomly selected) was run through MG-RAST. Preliminary results for the gut content (1,269,153 sequences, average length 202 bp) indicated that 0.1% of the sequences contained ribosomal RNA genes with the majority (67%) classified as Bacteria, a relatively small per cent (1.4%) as Archaea, and 31% as Eukaryota. Campylobacterales was the predominant order (14%), with unclassified (7.5%) and Desulfobacterales (4%) being the next dominant. Preliminary results for the worm tissue (2,716,461 sequences, average length 241 bp) indicated that the majority of sequences were Eukaryota (73%), with 256 sequences classified as phylum Annelida and 58% of those belonging to class Polychaeta. For the bacterial sequences obtained from the tissue samples, the predominant order was Actinomycetales (2.7%). For both the tissue and gut content samples, the majority of proteins were classified as clustering-based subsystems. This preliminary analysis will be compared to an assembly consisting of 40M of the highest quality reads.

  2. Metagenomics for the development of new biocatalysts to advance lignocellulose saccharification for bioeconomic development.

    Science.gov (United States)

    Montella, Salvatore; Amore, Antonella; Faraco, Vincenza

    2016-12-01

    The world economy is moving toward the use of renewable and nonedible lignocellulosic biomasses as substitutes for fossil sources in order to decrease the environmental impact of manufacturing processes and overcome the conflict with food production. Enzymatic hydrolysis of the feedstock is a key technology for bio-based chemical production, and the identification of novel, less expensive and more efficient biocatalysts is one of the main challenges. As the genomic era has shown that only a few microorganisms can be cultured under standard laboratory conditions, the extraction and analysis of genetic material directly from environmental samples, termed metagenomics, is a promising way to overcome this bottleneck. Two screening methodologies can be used on metagenomic material: the function-driven approach of expression libraries and sequence-driven analysis based on gene homology. Both techniques have been shown to be useful for the discovery of novel biocatalysts for lignocellulose conversion, and they enabled identification of several (hemi)cellulases and accessory enzymes involved in (hemi)cellulose hydrolysis. This review summarizes the latest progress in metagenomics aimed at discovering new enzymes for lignocellulose saccharification.

  3. Mining metagenomic and metatranscriptomic data for clues about microbial metabolic functions in ruminants.

    Science.gov (United States)

    Li, Fuyong; Neves, Andre L A; Ghoshal, Bibaswan; Guan, Le Luo

    2017-12-20

    Metagenomics and metatranscriptomics can capture the whole genome and transcriptome repertoire of microorganisms through sequencing total DNA/RNA from various environmental samples, providing both taxonomic and functional information with high resolution. The unique and complex rumen microbial ecosystem is receiving great research attention because the rumen microbiota coevolves with the host and equips ruminants with the ability to convert cellulosic plant materials to high-protein products for human consumption. To date, hundreds to thousands of microbial phylotypes have been identified in the rumen using culture-independent molecular-based approaches, and genomic information of rumen microorganisms is rapidly accumulating through single-genome sequencing. However, functional characteristics of the rumen microbiome have not been well described because there are numerous uncultivable microorganisms in the rumen. The advent of metagenomics and metatranscriptomics along with advanced bioinformatics methods can help us better understand mechanisms of rumen fermentation, which is vital for improving nutrient utilization and animal productivity. Therefore, in this review, we summarize a general workflow to conduct rumen metagenomics and metatranscriptomics and discuss how the data can be interpreted to yield useful information. Moreover, we review recent literature studying associations between the rumen microbiome and host phenotypes (e.g., feed efficiency and methane emissions) using these approaches, aiming to provide a useful guide for including the study of the rumen microbiome as one of the research objectives using these 2 approaches. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  4. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data.

    Science.gov (United States)

    Silva, Genivaldo Gueiros Z; Green, Kevin T; Dutilh, Bas E; Edwards, Robert A

    2016-02-01

    Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question of what they can do. Currently available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools. SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS. Contact: redwards@mail.sdsu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  5. Metagenomic profiling of known and unknown microbes with microbeGPS.

    Directory of Open Access Journals (Sweden)

    Martin S Lindner

    Full Text Available Microbial community profiling identifies and quantifies organisms in metagenomic sequencing data using either reference based or unsupervised approaches. However, current reference based profiling methods only report the presence and abundance of single reference genomes that are available in databases. Since only a small fraction of environmental genomes is represented in genomic databases, these approaches entail the risk of false identifications and often suggest a higher precision than justified by the data. Therefore, we developed MicrobeGPS, a novel metagenomic profiling approach that overcomes these limitations. MicrobeGPS is the first method that identifies microbiota in the sample and estimates their genomic distances to known reference genomes. With this strategy, MicrobeGPS identifies organisms down to the strain level and highlights possibly inaccurate identifications when the correct reference genome is missing. We demonstrate on three metagenomic datasets with different origin that our approach successfully avoids misleading interpretation of results and additionally provides more accurate results than current profiling methods. Our results indicate that MicrobeGPS can enable reference based taxonomic profiling of complex and less characterized microbial communities. MicrobeGPS is open source and available from https://sourceforge.net/projects/microbegps/ as source code and binary distribution for Windows and Linux operating systems.

  6. Metagenomic profiling of known and unknown microbes with microbeGPS.

    Science.gov (United States)

    Lindner, Martin S; Renard, Bernhard Y

    2015-01-01

    Microbial community profiling identifies and quantifies organisms in metagenomic sequencing data using either reference based or unsupervised approaches. However, current reference based profiling methods only report the presence and abundance of single reference genomes that are available in databases. Since only a small fraction of environmental genomes is represented in genomic databases, these approaches entail the risk of false identifications and often suggest a higher precision than justified by the data. Therefore, we developed MicrobeGPS, a novel metagenomic profiling approach that overcomes these limitations. MicrobeGPS is the first method that identifies microbiota in the sample and estimates their genomic distances to known reference genomes. With this strategy, MicrobeGPS identifies organisms down to the strain level and highlights possibly inaccurate identifications when the correct reference genome is missing. We demonstrate on three metagenomic datasets with different origin that our approach successfully avoids misleading interpretation of results and additionally provides more accurate results than current profiling methods. Our results indicate that MicrobeGPS can enable reference based taxonomic profiling of complex and less characterized microbial communities. MicrobeGPS is open source and available from https://sourceforge.net/projects/microbegps/ as source code and binary distribution for Windows and Linux operating systems.

  7. Activity-Based Screening of Metagenomic Libraries for Hydrogenase Enzymes.

    Science.gov (United States)

    Adam, Nicole; Perner, Mirjam

    2017-01-01

    Here we outline how to identify hydrogenase enzymes from metagenomic libraries through an activity-based screening approach. A metagenomic fosmid library is constructed in E. coli and the fosmids are transferred into a hydrogenase deletion mutant of Shewanella oneidensis (ΔhyaB) via triparental mating. If a fosmid exhibits hydrogen uptake activity, S. oneidensis' phenotype is restored and hydrogenase activity is indicated by a color change of the medium from yellow to colorless. This new method enables screening of 48 metagenomic fosmid clones in parallel.

  8. Bioprospecting potential of the soil metagenome: novel enzymes and bioactivities.

    Science.gov (United States)

    Lee, Myung Hwan; Lee, Seon-Woo

    2013-09-01

    The microbial diversity in soil ecosystems is higher than in any other microbial ecosystem. The majority of soil microorganisms has not been characterized, because the dominant members have not been readily culturable on standard cultivation media; therefore, the soil ecosystem is a great reservoir for the discovery of novel microbial enzymes and bioactivities. The soil metagenome, the collective microbial genome, could be cloned and sequenced directly from soils to search for novel microbial resources. This review summarizes the microbial diversity in soils and the efforts to search for microbial resources from the soil metagenome, with more emphasis on the potential of bioprospecting metagenomics and recent discoveries.

  9. Bioprospecting Potential of the Soil Metagenome: Novel Enzymes and Bioactivities

    Directory of Open Access Journals (Sweden)

    Myung Hwan Lee

    2013-09-01

    Full Text Available The microbial diversity in soil ecosystems is higher than in any other microbial ecosystem. The majority of soil microorganisms has not been characterized, because the dominant members have not been readily culturable on standard cultivation media; therefore, the soil ecosystem is a great reservoir for the discovery of novel microbial enzymes and bioactivities. The soil metagenome, the collective microbial genome, could be cloned and sequenced directly from soils to search for novel microbial resources. This review summarizes the microbial diversity in soils and the efforts to search for microbial resources from the soil metagenome, with more emphasis on the potential of bioprospecting metagenomics and recent discoveries.

  10. Metagenomic search strategies for interactions among plants and multiple microbes

    Directory of Open Access Journals (Sweden)

    Ulrich Karl Melcher

    2014-06-01

    Full Text Available Plants harbor multiple microbes. Metagenomics can facilitate understanding of the significance, for the plant, of the microbes and of the interactions among them. However, current approaches to metagenomic analysis of plants are computationally time-consuming. Efforts to speed the discovery process include improvement of computational speed, condensing the sequencing reads into smaller datasets before BLAST searches, simplifying the target database of BLAST searches, and flipping the roles of metagenomic and reference datasets. The latter is exemplified by the E-probe diagnostic nucleic acid analysis (EDNA) approach originally devised for improving analysis during plant quarantine.
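
    The role-flipping idea in the record above can be illustrated with a toy sketch: rather than querying every read against a large reference database, a small set of target-specific probes is looked up in an index built from the reads. This is only an illustration under stated assumptions. It uses exact k-mer matching in place of BLAST, and the probe sequences, target names, and function names are all hypothetical; it is not the EDNA implementation.

```python
from typing import Dict, Iterable, List, Set


def build_read_index(reads: Iterable[str], k: int) -> Set[str]:
    """Collect every k-mer that occurs in the metagenomic reads."""
    index: Set[str] = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            index.add(read[i:i + k])
    return index


def probe_hits(probes: Dict[str, List[str]], reads: Iterable[str], k: int) -> Dict[str, int]:
    """Count, per target, how many of its probes occur exactly in the reads.

    Assumption: every probe has length k (exact matching only).
    """
    index = build_read_index(reads, k)
    return {
        target: sum(1 for probe in probe_list if probe.upper() in index)
        for target, probe_list in probes.items()
    }


# Hypothetical toy data: two made-up targets with 8-nt probes.
probes = {
    "target_A": ["ACGTACGT", "TTGACCAA"],
    "target_B": ["GGGGCCCC"],
}
reads = ["NNACGTACGTNN", "CCTTGACCAAGG", "ATATATATATAT"]
hits = probe_hits(probes, reads, k=8)
print(hits)
```

    In this sketch the cost of the search scales with the number of probes rather than with the size of a reference database, which is the point of flipping the roles.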

  11. A viral metagenomic approach on a nonmetagenomic experiment

    DEFF Research Database (Denmark)

    Bovo, Samuele; Mazzoni, Gianluca; Ribani, Anisa

    2017-01-01

    unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses...... a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets non-originally derived by metagenomics...... experiments for viral metagenomics analyses in a livestock species....

  12. An Improved Methodology to Overcome Key Issues in Human Fecal Metagenomic DNA Extraction

    Directory of Open Access Journals (Sweden)

    Jitendra Kumar

    2016-12-01

    Full Text Available Microbes are ubiquitously distributed in nature, and recent culture-independent studies have highlighted the significance of gut microbiota in human health and disease. Fecal DNA is the primary source for the majority of human gut microbiome studies. However, further improvement is needed to obtain fecal metagenomic DNA with sufficient amount and good quality but low host genomic DNA contamination. In the current study, we demonstrate a quick, robust, unbiased, and cost-effective method for the isolation of high molecular weight (>23 kb) metagenomic DNA (260/280 ratio >1.8) with a good yield (55.8 ± 3.8 ng/mg of feces). We also confirm that there is very low human genomic DNA contamination (eubacterial: human genomic DNA marker genes = 227.9:1) in the human feces. The newly-developed method robustly performs for fresh as well as stored fecal samples as demonstrated by 16S rRNA gene sequencing using 454 FLX+. Moreover, 16S rRNA gene analysis indicated that compared to other DNA extraction methods tested, the fecal metagenomic DNA isolated with current methodology retains species richness and does not show microbial diversity biases, which is further confirmed by qPCR with a known quantity of spike-in genomes. Overall, our data highlight a protocol with a balance between quality, amount, user-friendliness, and cost effectiveness for its suitability toward usage for culture-independent analysis of the human gut microbiome, which provides a robust solution to overcome key issues associated with fecal metagenomic DNA isolation in human gut microbiome studies.

  13. Metagenomics, metatranscriptomics and single cell genomics reveal functional response of active Oceanospirillales to Gulf oil spill

    Energy Technology Data Exchange (ETDEWEB)

    Mason, Olivia U.; Hazen, Terry C.; Borglin, Sharon; Chain, Patrick S. G.; Dubinsky, Eric A.; Fortney, Julian L.; Han, James; Holman, Hoi-Ying N.; Hultman, Jenni; Lamendella, Regina; Mackelprang, Rachel; Malfatti, Stephanie; Tom, Lauren M.; Tringe, Susannah G.; Woyke, Tanja; Zhou, Jizhong; Rubin, Edward M.; Jansson, Janet K.

    2012-06-12

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  14. Tracking microbial colonization in fecal microbiota transplantation experiments via genome-resolved metagenomics.

    Science.gov (United States)

    Lee, Sonny T M; Kahn, Stacy A; Delmont, Tom O; Shaiber, Alon; Esen, Özcan C; Hubert, Nathaniel A; Morrison, Hilary G; Antonopoulos, Dionysios A; Rubin, David T; Eren, A Murat

    2017-05-04

    Fecal microbiota transplantation (FMT) is an effective treatment for recurrent Clostridium difficile infection and shows promise for treating other medical conditions associated with intestinal dysbioses. However, we lack a sufficient understanding of which microbial populations successfully colonize the recipient gut, and the widely used approaches to study the microbial ecology of FMT experiments fail to provide enough resolution to identify populations that are likely responsible for FMT-derived benefits. We used shotgun metagenomics together with assembly and binning strategies to reconstruct metagenome-assembled genomes (MAGs) from fecal samples of a single FMT donor. We then used metagenomic mapping to track the occurrence and distribution patterns of donor MAGs in two FMT recipients. Our analyses revealed that 22% of the 92 highly complete bacterial MAGs that we identified from the donor successfully colonized and remained abundant in two recipients for at least 8 weeks. Most MAGs with a high colonization rate belonged to the order Bacteroidales. The vast majority of those that lacked evidence of colonization belonged to the order Clostridiales, and colonization success was negatively correlated with the number of genes related to sporulation. Our analysis of 151 publicly available gut metagenomes showed that the donor MAGs that colonized both recipients were prevalent, and the ones that colonized neither were rare across the participants of the Human Microbiome Project. Although our dataset showed a link between taxonomy and the colonization ability of a given MAG, we also identified MAGs that belong to the same taxon with different colonization properties, highlighting the importance of an appropriate level of resolution to explore the functional basis of colonization and to identify targets for cultivation, hypothesis generation, and testing in model systems. The analytical strategy adopted in our study can provide genomic insights into bacterial

  15. An Improved Methodology to Overcome Key Issues in Human Fecal Metagenomic DNA Extraction.

    Science.gov (United States)

    Kumar, Jitendra; Kumar, Manoj; Gupta, Shashank; Ahmed, Vasim; Bhambi, Manu; Pandey, Rajesh; Chauhan, Nar Singh

    2016-12-01

    Microbes are ubiquitously distributed in nature, and recent culture-independent studies have highlighted the significance of gut microbiota in human health and disease. Fecal DNA is the primary source for the majority of human gut microbiome studies. However, further improvement is needed to obtain fecal metagenomic DNA with sufficient amount and good quality but low host genomic DNA contamination. In the current study, we demonstrate a quick, robust, unbiased, and cost-effective method for the isolation of high molecular weight (>23kb) metagenomic DNA (260/280 ratio >1.8) with a good yield (55.8±3.8ng/mg of feces). We also confirm that there is very low human genomic DNA contamination (eubacterial: human genomic DNA marker genes=227.9:1) in the human feces. The newly-developed method robustly performs for fresh as well as stored fecal samples as demonstrated by 16S rRNA gene sequencing using 454 FLX+. Moreover, 16S rRNA gene analysis indicated that compared to other DNA extraction methods tested, the fecal metagenomic DNA isolated with current methodology retains species richness and does not show microbial diversity biases, which is further confirmed by qPCR with a known quantity of spike-in genomes. Overall, our data highlight a protocol with a balance between quality, amount, user-friendliness, and cost effectiveness for its suitability toward usage for culture-independent analysis of the human gut microbiome, which provides a robust solution to overcome key issues associated with fecal metagenomic DNA isolation in human gut microbiome studies. Copyright © 2016 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

  16. Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software.

    Science.gov (United States)

    Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter; Koslicki, David; Janssen, Stefan; Dröge, Johannes; Gregor, Ivan; Majda, Stephan; Fiedler, Jessika; Dahms, Eik; Bremges, Andreas; Fritz, Adrian; Garrido-Oter, Ruben; Jørgensen, Tue Sparholt; Shapiro, Nicole; Blood, Philip D; Gurevich, Alexey; Bai, Yang; Turaev, Dmitrij; DeMaere, Matthew Z; Chikhi, Rayan; Nagarajan, Niranjan; Quince, Christopher; Meyer, Fernando; Balvočiūtė, Monika; Hansen, Lars Hestbjerg; Sørensen, Søren J; Chia, Burton K H; Denis, Bertrand; Froula, Jeff L; Wang, Zhong; Egan, Robert; Don Kang, Dongwan; Cook, Jeffrey J; Deltel, Charles; Beckstette, Michael; Lemaitre, Claire; Peterlongo, Pierre; Rizk, Guillaume; Lavenier, Dominique; Wu, Yu-Wei; Singer, Steven W; Jain, Chirag; Strous, Marc; Klingenberg, Heiner; Meinicke, Peter; Barton, Michael D; Lingner, Thomas; Lin, Hsin-Hung; Liao, Yu-Chieh; Silva, Genivaldo Gueiros Z; Cuevas, Daniel A; Edwards, Robert A; Saha, Surya; Piro, Vitor C; Renard, Bernhard Y; Pop, Mihai; Klenk, Hans-Peter; Göker, Markus; Kyrpides, Nikos C; Woyke, Tanja; Vorholt, Julia A; Schulze-Lefert, Paul; Rubin, Edward M; Darling, Aaron E; Rattei, Thomas; McHardy, Alice C

    2017-11-01

    Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.

  17. METAREP: JCVI metagenomics reports--an open source tool for high-performance comparative metagenomics.

    Science.gov (United States)

    Goll, Johannes; Rusch, Douglas B; Tanenbaum, David M; Thiagarajan, Mathangi; Li, Kelvin; Methé, Barbara A; Yooseph, Shibu

    2010-10-15

    JCVI Metagenomics Reports (METAREP) is a Web 2.0 application designed to help scientists analyze and compare annotated metagenomics datasets. It utilizes Solr/Lucene, a high-performance scalable search engine, to quickly query large data collections. Furthermore, users can use its SQL-like query syntax to filter and refine datasets. METAREP provides graphical summaries for top taxonomic and functional classifications as well as a GO, NCBI Taxonomy and KEGG Pathway Browser. Users can compare absolute and relative counts of multiple datasets at various functional and taxonomic levels. Advanced comparative features comprise statistical tests as well as multidimensional scaling, heatmap and hierarchical clustering plots. Summaries can be exported as tab-delimited files and as publication-quality plots in PDF format. A data management layer allows collaborative data analysis and result sharing. Web site: http://www.jcvi.org/metarep; source code: http://github.com/jcvi/METAREP. Contact: syooseph@jcvi.org. Supplementary data are available at Bioinformatics online.

  18. A Functional Metagenomic Analysis of Tetracycline Resistance in Cheese Bacteria

    Directory of Open Access Journals (Sweden)

    Ana B. Flórez

    2017-05-01

    Full Text Available Metagenomic techniques have been successfully used to monitor antibiotic resistance genes in environmental, animal and human ecosystems. However, despite the claim that the food chain plays a key role in the spread of antibiotic resistance, metagenomic analysis has scarcely been used to investigate food systems. The present work reports a functional metagenomic analysis of the prevalence and evolution of tetracycline resistance determinants in a raw-milk, blue-veined cheese during manufacturing and ripening. For this, the same cheese batch was sampled and analyzed on days 3 and 60 of manufacture. Samples were diluted and grown in the presence of tetracycline on plate count milk agar (PCMA; non-selective) and de Man Rogosa and Sharpe (MRS) agar (selective for lactic acid bacteria, LAB). DNA from the cultured bacteria was then isolated and used to construct four fosmid libraries, named after the medium and sampling time: PCMA-3D, PCMA-60D, MRS-3D, and MRS-60D. Clones in the libraries were subjected to restriction enzyme analysis, PCR amplification, and sequencing. Among the 300 fosmid clones analyzed, 268 different EcoRI restriction profiles were encountered. Sequence homology of their extremes clustered the clones into 47 groups. Representative clones of all groups were then screened for the presence of tetracycline resistance genes by PCR, targeting well-recognized genes coding for ribosomal protection proteins and efflux pumps. A single tetracycline resistance gene was detected in each of the clones, with four such resistance genes identified in total: tet(A), tet(L), tet(M), and tet(S). tet(A) was the only gene identified in the PCMA-3D library, and tet(L) the only one identified in the PCMA-60D and MRS-60D libraries. tet(M) and tet(S) were both detected in the MRS-3D library and in similar numbers. Six representative clones of the libraries were sequenced and analyzed. Long segments of all clones but one showed extensive homology to plasmids from Gram

  19. ARGs-OAP: online analysis pipeline for antibiotic resistance genes detection from metagenomic data using an integrated structured ARG-database.

    Science.gov (United States)

    Yang, Ying; Jiang, Xiaotao; Chai, Benli; Ma, Liping; Li, Bing; Zhang, Anni; Cole, James R; Tiedje, James M; Zhang, Tong

    2016-08-01

    Environmental dissemination of antibiotic resistance genes (ARGs) has become an increasing concern for public health. Metagenomics approaches can effectively detect broad profiles of ARGs in environmental samples; however, the detection and subsequent classification of ARG-like sequences are time consuming and have been severe obstacles in employing metagenomic methods. We sought to accelerate quantification of ARGs in metagenomic data from environmental samples. A Structured ARG reference database (SARG) was constructed by integrating ARDB and CARD, the two most commonly used databases. SARG was curated to remove redundant sequences and optimized to facilitate query sequence identification by similarity. A database with a hierarchical structure (type-subtype-reference sequence) was then constructed to facilitate classification (assigning ARG-like sequence to type, subtype and reference sequence) of sequences identified through similarity search. Utilizing SARG and a previously proposed hybrid functional gene annotation pipeline, we developed an online pipeline called ARGs-OAP for fast annotation and classification of ARG-like sequences from metagenomic data. We also evaluated and proposed a set of criteria important for efficiently conducting metagenomic analysis of ARGs using ARGs-OAP. Perl script for ARGs-OAP can be downloaded from https://github.com/biofuture/Ublastx_stageone. ARGs-OAP can be accessed through http://smile.hku.hk/SARGs. Contact: zhangt@hku.hk or tiedjej@msu.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
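
    The type-subtype-reference sequence hierarchy described in the record above lends itself to a simple roll-up: once a read matches a reference sequence, its type and subtype follow from a lookup. The sketch below is a minimal illustration of that idea only. The reference IDs, the mapping contents, and the function name are made up for the example and are not taken from SARG or the ARGs-OAP scripts.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical hierarchy: reference sequence ID -> (type, subtype).
HIERARCHY: Dict[str, Tuple[str, str]] = {
    "ref_001": ("tetracycline", "tetM"),
    "ref_002": ("tetracycline", "tetL"),
    "ref_003": ("beta-lactam", "TEM-1"),
}


def classify_hits(
    hit_refs: Iterable[str],
    hierarchy: Dict[str, Tuple[str, str]],
) -> Tuple[Counter, Counter]:
    """Roll similarity-search hits up to type-level and subtype-level counts.

    Hits to references missing from the hierarchy are ignored.
    """
    type_counts: Counter = Counter()
    subtype_counts: Counter = Counter()
    for ref in hit_refs:
        if ref not in hierarchy:
            continue
        arg_type, subtype = hierarchy[ref]
        type_counts[arg_type] += 1
        subtype_counts[(arg_type, subtype)] += 1
    return type_counts, subtype_counts


# Hypothetical best-hit reference IDs, one per ARG-like read.
hits = ["ref_001", "ref_001", "ref_002", "ref_003", "ref_999"]
type_counts, subtype_counts = classify_hits(hits, HIERARCHY)
print(dict(type_counts))
print(dict(subtype_counts))
```

    Real pipelines typically normalize such counts (for example by reference length or by a marker-gene count) before comparing samples; that step is omitted here.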

  20. Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

    Energy Technology Data Exchange (ETDEWEB)

    White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya; Zucker, Jeremy D.; Brislawn, Colin J.; Nicora, Carrie D.; Fansler, Sarah J.; Glaesemann, Kurt R.; Glass, Kevin; Jansson, Janet K.; Langille, Morgan

    2016-06-28

    functional roles in ecosystem stability and responses to environmental perturbations. This knowledge gap is largely due to the difficulty in culturing the majority of soil microbes. Thus, use of culture-independent approaches, such as metagenomics, promises the direct assessment of the functional potential of soil microbiomes. Soil is, however, a challenge for metagenomic assembly due to its high microbial diversity and variable evenness, resulting in low coverage and uneven sampling of microbial genomes. Despite increasingly large soil metagenome data volumes (>200 Gbp), the majority of the data do not assemble. Here, we used the cutting-edge approach of synthetic long-read sequencing technology (Moleculo) to assemble soil metagenome sequence data into long contigs and used the assemblies for binning of genomes.

  1. The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium inaugural meeting report.

    Science.gov (United States)

    2016-06-03

    The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium is a novel, interdisciplinary initiative comprised of experts across many fields, including genomics, data analysis, engineering, public health, and architecture. The ultimate goal of the MetaSUB Consortium is to improve city utilization and planning through the detection, measurement, and design of metagenomics within urban environments. Although continual measures occur for temperature, air pressure, weather, and human activity, including longitudinal, cross-kingdom ecosystem dynamics can alter and improve the design of cities. The MetaSUB Consortium is aiding these efforts by developing and testing metagenomic methods and standards, including optimized methods for sample collection, DNA/RNA isolation, taxa characterization, and data visualization. The data produced by the consortium can aid city planners, public health officials, and architectural designers. In addition, the study will continue to lead to the discovery of new species, global maps of antimicrobial resistance (AMR) markers, and novel biosynthetic gene clusters (BGCs). Finally, we note that engineered metagenomic ecosystems can help enable more responsive, safer, and quantified cities.

  2. Improving contig binning of metagenomic data using [Formula: see text] oligonucleotide frequency dissimilarity.

    Science.gov (United States)

    Wang, Ying; Wang, Kun; Lu, Yang Young; Sun, Fengzhu

    2017-09-20

    Metagenomics sequencing provides deep insights into microbial communities. To investigate their taxonomic structure, binning assembled contigs into discrete clusters is critical. Many binning algorithms have been developed, but their performance is not always satisfactory, especially for complex microbial communities, calling for further development. According to previous studies, relative sequence compositions are similar across different regions of the same genome, but they differ between distinct genomes. Generally, current tools have used the normalized frequency of k-tuples directly, but this represents an absolute, not relative, sequence composition. Therefore, we attempted to model contigs using relative k-tuple composition, followed by measuring dissimilarity between contigs using [Formula: see text]. The [Formula: see text] was designed to measure the dissimilarity between two long sequences or Next-Generation Sequencing data with the Markov models of the background genomes. This method was effective in revealing group and gradient relationships between genomes, metagenomes and metatranscriptomes. With many binning tools available, we do not try to bin contigs from scratch. Instead, we developed [Formula: see text] to adjust contigs among bins based on the output of existing binning tools for a single metagenomic sample. The tool is taxonomy-free and depends only on k-tuples. To evaluate the performance of [Formula: see text], five widely used binning tools with different strategies of sequence composition or the hybrid of sequence composition and abundance were selected to bin six synthetic and real datasets, after which [Formula: see text] was applied to adjust the binning results. Our experiments showed that [Formula: see text] consistently achieves the best performance with tuple length k = 6 under the independent identically distributed (i.i.d.) background model. Using the metrics of recall, precision and ARI (Adjusted Rand Index), [Formula: see
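
    As a point of reference for the record above, the sketch below computes the normalized k-tuple frequency vector of a contig and compares two contigs with a plain Manhattan distance. This corresponds to the "normalized frequency of k-tuples" representation that the abstract attributes to current tools; it is not the dissimilarity measure proposed in the paper, which additionally models the background genome. Function names and the example contigs are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict


def ktuple_frequencies(seq: str, k: int) -> Dict[str, float]:
    """Return the normalized frequency of every A/C/G/T k-tuple in seq.

    k-tuples containing other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    tuples = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[t] for t in tuples)
    if total == 0:
        return {t: 0.0 for t in tuples}
    return {t: counts[t] / total for t in tuples}


def manhattan(f: Dict[str, float], g: Dict[str, float]) -> float:
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(f[t] - g[t]) for t in f)


# Hypothetical toy contigs.
contig_a = "ACGTACGTACGT"
contig_c = "AAAAAAAAAAAA"
freq_a = ktuple_frequencies(contig_a, k=2)
freq_c = ktuple_frequencies(contig_c, k=2)
print(manhattan(freq_a, freq_a))
print(manhattan(freq_a, freq_c))
```

    Two contigs that share no k-tuples reach the maximum distance of 2 under this measure, while identical compositions give 0.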

  3. Comparative Metagenomics of Freshwater Microbial Communities

    Energy Technology Data Exchange (ETDEWEB)

    Hemme, Chris; Deng, Ye; Tu, Qichao; Fields, Matthew; Gentry, Terry; Wu, Liyou; Tringe, Susannah; Watson, David; He, Zhili; Hazen, Terry; Tiedje, James; Rubin, Eddy; Zhou, Jizhong

    2010-05-17

    Previous analyses of a microbial metagenome from uranium and nitric-acid contaminated groundwater (FW106) showed significant environmental effects resulting from the rapid introduction of multiple contaminants. Effects include a massive loss of species and strain biodiversity, accumulation of toxin resistance genes in the metagenome and lateral transfer of toxin resistance genes between community members. To better understand these results in an ecological context, a second metagenome from a pristine groundwater system located along the same geological strike was sequenced and analyzed (FW301). It is hypothesized that FW301 approximates the ancestral FW106 community based on phylogenetic profiles and common geological parameters; however, even if this is not the case, the datasets still permit comparisons between healthy and stressed groundwater ecosystems. Complex carbohydrate metabolism has been almost entirely lost in the stressed ecosystem. In contrast, the pristine system encodes a wide diversity of complex carbohydrate metabolism systems, suggesting that carbon turnover is very rapid and less leaky in the healthy groundwater system. FW301 encodes many (~160+) carbon monoxide dehydrogenase genes while FW106 encodes none. This result suggests that the community is frequently exposed to oxygen from aerated rainwater percolating into the subsurface, with a resulting high rate of carbon metabolism and CO production. When oxygen levels fall, the CO then serves as a major carbon source for the community. FW301 appears to be capable of CO2 fixation via the reductive carboxylase (reverse TCA) cycle and possibly acetogenesis; these activities are lacking in the heterotrophic FW106 system, which relies exclusively on respiration of nitrate and/or oxygen for energy production. FW301 encodes a complete B12 biosynthesis pathway at high abundance, suggesting the use of sodium gradients for energy production in the healthy groundwater community. Overall

  4. Storage conditions of intestinal microbiota matter in metagenomic analysis

    Directory of Open Access Journals (Sweden)

    Cardona Silvia

    2012-07-01

    Full Text Available Abstract Background: The structure and function of human gut microbiota is currently inferred from metagenomic and metatranscriptomic analyses. Recovery of intact DNA and RNA is therefore a critical step in these studies. Here, we evaluated how different storage conditions of fecal samples affect the quality of extracted nucleic acids and the stability of their microbial communities. Results: We assessed the quality of genomic DNA and total RNA by microcapillary electrophoresis and analyzed the bacterial community structure by pyrosequencing the 16S rRNA gene. DNA and RNA started to fragment when samples were kept at room temperature for more than 24 h. The use of RNAse inhibitors diminished RNA degradation but this protection was not consistent among individuals. DNA and RNA degradation also occurred when frozen samples were defrosted for a short period (1 h) before nucleic acid extraction. The same conditions that affected DNA and RNA integrity also altered the relative abundance of most taxa in the bacterial community analysis. In this case, intra-individual variability of microbial diversity was larger than the inter-individual one. Conclusions: Though this preliminary work explored a very limited number of parameters, the results suggest that storage conditions of fecal samples affect the integrity of DNA and RNA and the composition of their microbial community. For optimal preservation, stool samples should be kept at room temperature and brought to the laboratory within 24 h after collection or be stored immediately at −20°C in a home freezer and transported afterwards in a freezer pack to ensure that they do not defrost at any time. Mixing the samples with RNAse inhibitors outside the laboratory is not recommended since proper homogenization of the stool is difficult to monitor.

  5. Storage conditions of intestinal microbiota matter in metagenomic analysis.

    Science.gov (United States)

    Cardona, Silvia; Eck, Anat; Cassellas, Montserrat; Gallart, Milagros; Alastrue, Carmen; Dore, Joel; Azpiroz, Fernando; Roca, Joaquim; Guarner, Francisco; Manichanh, Chaysavanh

    2012-07-30

    The structure and function of human gut microbiota is currently inferred from metagenomic and metatranscriptomic analyses. Recovery of intact DNA and RNA is therefore a critical step in these studies. Here, we evaluated how different storage conditions of fecal samples affect the quality of extracted nucleic acids and the stability of their microbial communities. We assessed the quality of genomic DNA and total RNA by microcapillary electrophoresis and analyzed the bacterial community structure by pyrosequencing the 16S rRNA gene. DNA and RNA started to fragment when samples were kept at room temperature for more than 24 h. The use of RNAse inhibitors diminished RNA degradation but this protection was not consistent among individuals. DNA and RNA degradation also occurred when frozen samples were defrosted for a short period (1 h) before nucleic acid extraction. The same conditions that affected DNA and RNA integrity also altered the relative abundance of most taxa in the bacterial community analysis. In this case, intra-individual variability of microbial diversity was larger than the inter-individual one. Though this preliminary work explored a very limited number of parameters, the results suggest that storage conditions of fecal samples affect the integrity of DNA and RNA and the composition of their microbial community. For optimal preservation, stool samples should be kept at room temperature and brought to the laboratory within 24 h after collection or be stored immediately at -20°C in a home freezer and transported afterwards in a freezer pack to ensure that they do not defrost at any time. Mixing the samples with RNAse inhibitors outside the laboratory is not recommended since proper homogenization of the stool is difficult to monitor.

  6. CoMeta: classification of metagenomes using k-mers.

    Directory of Open Access Journals (Sweden)

    Jolanta Kawulok

    Full Text Available Nowadays, the study of environmental samples has been developing rapidly. Characterization of the environment composition broadens the knowledge about the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In the presented paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which assigns a query read (a DNA fragment) into one of the groups previously prepared by the user. Typically, this is one of the taxonomic ranks (e.g., phylum, genus); however, prepared groups may contain sequences having various functions. In CoMeta, we used the exact method for read classification using short subsequences (k-mers) and a fast program for indexing large sets of k-mers. In contrast to the most popular methods based on BLAST, where the query is compared with each reference sequence, we begin the classification from the top of the taxonomy tree to reduce the number of comparisons. The presented experimental study confirms that CoMeta outperforms other programs used in this context. CoMeta is available at https://github.com/jkawulok/cometa under a free GNU GPL 2 license.

  7. CoMeta: classification of metagenomes using k-mers.

    Science.gov (United States)

    Kawulok, Jolanta; Deorowicz, Sebastian

    2015-01-01

    Nowadays, the study of environmental samples has been developing rapidly. Characterization of the environment composition broadens the knowledge about the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In the presented paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which assigns a query read (a DNA fragment) into one of the groups previously prepared by the user. Typically, this is one of the taxonomic ranks (e.g., phylum, genus); however, prepared groups may contain sequences having various functions. In CoMeta, we used the exact method for read classification using short subsequences (k-mers) and a fast program for indexing large sets of k-mers. In contrast to the most popular methods based on BLAST, where the query is compared with each reference sequence, we begin the classification from the top of the taxonomy tree to reduce the number of comparisons. The presented experimental study confirms that CoMeta outperforms other programs used in this context. CoMeta is available at https://github.com/jkawulok/cometa under a free GNU GPL 2 license.
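
    The exact k-mer matching idea in this abstract can be pictured with a small sketch. It is a flat, single-level illustration under stated assumptions, not the CoMeta implementation: the function names, the toy sequences, the value of k, and the "most shared k-mers wins" rule are all hypothetical, and the top-down descent of the taxonomy tree is not modeled (it would repeat a step like this at each rank).

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(groups, k):
    """Map each k-mer to the set of group names whose references contain it."""
    index = defaultdict(set)
    for name, references in groups.items():
        for ref in references:
            for kmer in kmers(ref, k):
                index[kmer].add(name)
    return index


def classify(read, index, k):
    """Assign the read to the group sharing the most exact k-mers, or None."""
    votes = defaultdict(int)
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None
    # Sorting first makes ties resolve deterministically by group name.
    return max(sorted(votes), key=votes.get)


# Hypothetical toy references; real groups would hold whole genomes.
groups = {
    "group_A": ["ACGTACGTGGCCAATT"],
    "group_B": ["TTTTGGGGCCCCAAAA"],
}
index = build_index(groups, k=5)
```

    Calling classify("ACGTACGTGG", index, 5) looks up each 5-mer of the read in the prebuilt index, so the cost per read depends on the read length rather than on the number of reference sequences.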

  8. A Metagenomic Survey of Limestone Hill in Taiwan

    Science.gov (United States)

    Hsu, Y. W.; Li, K. Y.; Chen, Y. W.; Huang, T. Y.; Chen, W. J.; Shih, Y. J.; Chen, J. S.; Fan, C. W.; Hsu, B. M.

    2016-12-01

    The limestone of Narro-Sky in Tainliao, Taiwan consists of Pleistocene reef limestones interbedded in clastic layers that covered the Takangshan anticlines. Understanding how microbial relative abundance changes in response to environmental factors may contribute to a better comprehension of the roles that microorganisms play in altering landscape structures. In this study, microorganisms growing on the limestone wall, in the water dripping from the wall, and in the soil underneath the wall were collected from different locations where environmental factors such as daytime illumination, humidity, or pH differ. Next generation sequencing (NGS) was carried out to examine the composition and richness of the microbial community. The metagenomic sequences were clustered into operational taxonomic units (OTUs) to analyze relative abundance and diversity and to perform principal coordinates analysis (PCoA). Our results showed that the soil sample has the highest alpha diversity while the water sample has the lowest. Four major phyla, Proteobacteria, Acidobacteria, Actinobacteria, and Cyanobacteria, account for 80% of total microbial biomass in all groups. Cyanobacteria were most abundant on the limestone wall rather than in the water or soil of weathering limestone. The PCoA patterns of each phylum showed traces of dynamic changes in the microbial community, which might be affected by environmental factors. This study provides insights into how environmental factors work together with the microbial community to shape landscape structures.
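
    The alpha diversity comparison mentioned here can be illustrated with a short sketch. The abstract does not say which index was used, so the Shannon index below is an assumption, and the OTU counts are made-up numbers chosen only to show an even community scoring higher than a dominated one.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)


def relative_abundance(otu_counts):
    """Convert raw OTU counts to proportions of the sample total."""
    total = sum(otu_counts)
    return [c / total for c in otu_counts]


# Hypothetical OTU count tables, not data from the study.
soil_counts = [30, 25, 20, 15, 10]   # fairly even community
water_counts = [90, 5, 3, 2]         # dominated by one OTU
soil_h = shannon_index(soil_counts)
water_h = shannon_index(water_counts)
```

    With these illustrative counts the soil value exceeds the water value, which is the direction of the difference the abstract reports.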

  9. Mitochondrial metagenomics: letting the genes out of the bottle.

    Science.gov (United States)

    Crampton-Platt, Alex; Yu, Douglas W; Zhou, Xin; Vogler, Alfried P

    2016-01-01

    'Mitochondrial metagenomics' (MMG) is a methodology for shotgun sequencing of total DNA from specimen mixtures and subsequent bioinformatic extraction of mitochondrial sequences. The approach can be applied to phylogenetic analysis of taxonomically selected taxa, as an economical alternative to mitogenome sequencing from individual species, or to environmental samples of mixed specimens, such as from mass trapping of invertebrates. The routine generation of mitochondrial genome sequences has great potential both for systematics and community phylogenetics. Mapping of reads from low-coverage shotgun sequencing of environmental samples also makes it possible to obtain data on spatial and temporal turnover in whole-community phylogenetic and species composition, even in complex ecosystems where species-level taxonomy and biodiversity patterns are poorly known. In addition, read mapping can produce information on species biomass, and potentially allows quantification of within-species genetic variation. The success of MMG relies on the formation of numerous mitochondrial genome contigs, achievable with standard genome assemblers, but various challenges for the efficiency of assembly remain, particularly in the face of variable relative species abundance and intra-specific genetic variation. Nevertheless, several studies have demonstrated the power of mitogenomes from MMG for accurate phylogenetic placement, evolutionary analysis of species traits, biodiversity discovery and the establishment of species distribution patterns; it offers a promising avenue for unifying the ecological and evolutionary understanding of species diversity.

  10. WGSQuikr: fast whole-genome shotgun metagenomic classification.

    Directory of Open Access Journals (Sweden)

    David Koslicki

    Full Text Available With the decrease in cost and increase in output of whole-genome shotgun technologies, many metagenomic studies are utilizing this approach in lieu of the more traditional 16S rRNA amplicon technique. Due to the large number of relatively short reads output from whole-genome shotgun technologies, there is a need for fast and accurate short-read OTU classifiers. While there are relatively fast and accurate algorithms available, such as MetaPhlAn, MetaPhyler, PhyloPythiaS, and PhymmBL, these algorithms still classify samples in a read-by-read fashion and so execution times can range from hours to days on large datasets. We introduce WGSQuikr, a reconstruction method which can compute a vector of taxonomic assignments and their proportions in the sample with remarkable speed and accuracy. We demonstrate on simulated data that WGSQuikr is typically more accurate and up to an order of magnitude faster than the aforementioned classification algorithms. We also verify the utility of WGSQuikr on real biological data in the form of a mock community. WGSQuikr is a Whole-Genome Shotgun QUadratic, Iterative, K-mer based Reconstruction method which extends the previously introduced 16S rRNA-based algorithm Quikr. A MATLAB implementation of WGSQuikr is available at: http://sourceforge.net/projects/wgsquikr.

  11. WGSQuikr: fast whole-genome shotgun metagenomic classification.

    Science.gov (United States)

    Koslicki, David; Foucart, Simon; Rosen, Gail

    2014-01-01

    With the decrease in cost and increase in output of whole-genome shotgun technologies, many metagenomic studies are utilizing this approach in lieu of the more traditional 16S rRNA amplicon technique. Due to the large number of relatively short reads output from whole-genome shotgun technologies, there is a need for fast and accurate short-read OTU classifiers. While there are relatively fast and accurate algorithms available, such as MetaPhlAn, MetaPhyler, PhyloPythiaS, and PhymmBL, these algorithms still classify samples in a read-by-read fashion and so execution times can range from hours to days on large datasets. We introduce WGSQuikr, a reconstruction method which can compute a vector of taxonomic assignments and their proportions in the sample with remarkable speed and accuracy. We demonstrate on simulated data that WGSQuikr is typically more accurate and up to an order of magnitude faster than the aforementioned classification algorithms. We also verify the utility of WGSQuikr on real biological data in the form of a mock community. WGSQuikr is a Whole-Genome Shotgun QUadratic, Iterative, K-mer based Reconstruction method which extends the previously introduced 16S rRNA-based algorithm Quikr. A MATLAB implementation of WGSQuikr is available at: http://sourceforge.net/projects/wgsquikr.
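
    The contrast drawn here between read-by-read classification and reconstructing a whole vector of proportions can be sketched as follows. This is a stand-in under stated assumptions, not WGSQuikr itself: it poses the problem as plain nonnegative least squares on k-mer frequency profiles and solves it with projected gradient descent, and the reference sequences, k, and mixture weights are hypothetical.

```python
from itertools import product


def kmer_profile(seq, k):
    """Return normalized k-mer frequencies of seq in a fixed ACGT ordering."""
    alphabet = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in alphabet}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[kmer] / total for kmer in alphabet]


def reconstruct(profiles, sample, iterations=500):
    """Projected gradient descent for least squares A x = y with x >= 0.

    profiles: list of reference k-mer profiles (the columns of A).
    sample:   k-mer profile of the metagenome (the vector y).
    """
    n = len(profiles)
    m = len(sample)
    # Step size 1/L, with L bounded above by the squared Frobenius norm of A.
    step = 1.0 / sum(v * v for col in profiles for v in col)
    x = [0.0] * n
    for _ in range(iterations):
        fitted = [sum(profiles[j][i] * x[j] for j in range(n)) for i in range(m)]
        residual = [f - y for f, y in zip(fitted, sample)]
        for j in range(n):
            grad = sum(profiles[j][i] * residual[i] for i in range(m))
            x[j] = max(0.0, x[j] - step * grad)  # project onto x >= 0
    return x


# Hypothetical references with non-overlapping 2-mer content.
ref_profiles = [kmer_profile("AAAAAAAAAA", 2), kmer_profile("CCCCCCCCCC", 2)]
# Simulated sample: a 70/30 mixture of the two reference profiles.
sample = [0.7 * a + 0.3 * b for a, b in zip(*ref_profiles)]
proportions = reconstruct(ref_profiles, sample)
```

    The whole sample is summarized once as a k-mer profile and a single small optimization returns all proportions together, which is why this style of method avoids a per-read search.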

  12. Functional metagenomics for the investigation of antibiotic resistance.

    Science.gov (United States)

    Mullany, Peter

    2014-04-01

    Antibiotic resistance is a major threat to human health and well-being. To effectively combat this problem we need to understand the range of different resistance genes that allow bacteria to resist antibiotics. To do this the whole microbiota needs to be investigated. As most bacteria cannot be cultivated in the laboratory, the reservoir of antibiotic resistance genes in the non-cultivatable majority remains relatively unexplored. Currently the only way to study antibiotic resistance in these organisms is to use metagenomic approaches. Furthermore, the only method that does not require any prior knowledge about the resistance genes is functional metagenomics, which involves expressing genes from metagenomic clones in surrogate hosts. In this review the methods and limitations of functional metagenomics to isolate new antibiotic resistance genes and the mobile genetic elements that mediate their spread are explored.

  13. Consensus statement: Virus taxonomy in the age of metagenomics.

    Science.gov (United States)

    Simmonds, Peter; Adams, Mike J; Benkő, Mária; Breitbart, Mya; Brister, J Rodney; Carstens, Eric B; Davison, Andrew J; Delwart, Eric; Gorbalenya, Alexander E; Harrach, Balázs; Hull, Roger; King, Andrew M Q; Koonin, Eugene V; Krupovic, Mart; Kuhn, Jens H; Lefkowitz, Elliot J; Nibert, Max L; Orton, Richard; Roossinck, Marilyn J; Sabanadzovic, Sead; Sullivan, Matthew B; Suttle, Curtis A; Tesh, Robert B; van der Vlugt, René A; Varsani, Arvind; Zerbini, F Murilo

    2017-03-01

    The number and diversity of viral sequences that are identified in metagenomic data far exceeds that of experimentally characterized virus isolates. In a recent workshop, a panel of experts discussed the proposal that, with appropriate quality control, viruses that are known only from metagenomic data can, and should be, incorporated into the official classification scheme of the International Committee on Taxonomy of Viruses (ICTV). Although a taxonomy that is based on metagenomic sequence data alone represents a substantial departure from the traditional reliance on phenotypic properties, the development of a robust framework for sequence-based virus taxonomy is indispensable for the comprehensive characterization of the global virome. In this Consensus Statement article, we consider the rationale for why metagenomic sequence data should, and how it can, be incorporated into the ICTV taxonomy, and present proposals that have been endorsed by the Executive Committee of the ICTV.

  14. Functional metagenomics for the investigation of antibiotic resistance

    Science.gov (United States)

    Mullany, Peter

    2014-01-01

    Antibiotic resistance is a major threat to human health and well-being. To effectively combat this problem we need to understand the range of different resistance genes that allow bacteria to resist antibiotics. To do this the whole microbiota needs to be investigated. As most bacteria cannot be cultivated in the laboratory, the reservoir of antibiotic resistance genes in the non-cultivatable majority remains relatively unexplored. Currently the only way to study antibiotic resistance in these organisms is to use metagenomic approaches. Furthermore, the only method that does not require any prior knowledge about the resistance genes is functional metagenomics, which involves expressing genes from metagenomic clones in surrogate hosts. In this review the methods and limitations of functional metagenomics to isolate new antibiotic resistance genes and the mobile genetic elements that mediate their spread are explored. PMID:24556726

  15. Activity screening of environmental metagenomic libraries reveals novel carboxylesterase families

    Science.gov (United States)

    Popovic, Ana; Hai, Tran; Tchigvintsev, Anatoly; Hajighasemi, Mahbod; Nocek, Boguslaw; Khusnutdinova, Anna N.; Brown, Greg; Glinos, Julia; Flick, Robert; Skarina, Tatiana; Chernikova, Tatyana N.; Yim, Veronica; Brüls, Thomas; Paslier, Denis Le; Yakimov, Michail M.; Joachimiak, Andrzej; Ferrer, Manuel; Golyshina, Olga V.; Savchenko, Alexei; Golyshin, Peter N.; Yakunin, Alexander F.

    2017-01-01

    Metagenomics has made accessible an enormous reserve of global biochemical diversity. To tap into this vast resource of novel enzymes, we have screened over one million clones from metagenome DNA libraries derived from sixteen different environments for carboxylesterase activity and identified 714 positive hits. We have validated the esterase activity of 80 selected genes, which belong to 17 different protein families including unknown and cyclase-like proteins. Three metagenomic enzymes exhibited lipase activity, and seven proteins showed polyester depolymerization activity against polylactic acid and polycaprolactone. Detailed biochemical characterization of four new enzymes revealed their substrate preference, whereas their catalytic residues were identified using site-directed mutagenesis. The crystal structure of the metal-ion dependent esterase MGS0169 from the amidohydrolase superfamily revealed a novel active site with a bound unknown ligand. Thus, activity-centered metagenomics has revealed diverse enzymes and novel families of microbial carboxylesterases, whose activity could not have been predicted using bioinformatics tools. PMID:28272521

  16. Functional Metagenomic Investigations of the Human Intestinal Microbiota

    DEFF Research Database (Denmark)

    Moore, Aimee M.; Munck, Christian; Sommer, Morten Otto Alexander

    2011-01-01

    of this microbial community, its recalcitrance to standard cultivation, and the immense diversity of its encoded genes has necessitated the development of novel molecular, microbiological, and genomic tools. Functional metagenomics is one such culture-independent technique, used for decades to study environmental...... microorganisms, but relatively recently applied to the study of the human commensal microbiota. Metagenomic functional screens characterize the functional capacity of a microbial community, independent of identity to known genes, by subjecting the metagenome to functional assays in a genetically tractable host....... Here we highlight recent work applying this technique to study the functional diversity of the intestinal microbiota, and discuss how an approach combining high-throughput sequencing, cultivation, and metagenomic functional screens can improve our understanding of interactions between this complex...

  17. Quantitative metagenomic analyses based on average genome size normalization

    DEFF Research Database (Denmark)

    Frank, Jeremy Alexander; Sørensen, Søren Johannes

    2011-01-01

    Over the past quarter-century, microbiologists have used DNA sequence information to aid in the characterization of microbial communities. During the last decade, this has expanded from single genes to microbial community genomics, or metagenomics, in which the gene content of an environment can...... provide not just a census of the community members but direct information on metabolic capabilities and potential interactions among community members. Here we introduce a method for the quantitative characterization and comparison of microbial communities based on the normalization of metagenomic data...... by estimating average genome sizes. This normalization can relieve comparative biases introduced by differences in community structure, number of sequencing reads, and sequencing read lengths between different metagenomes. We demonstrate the utility of this approach by comparing metagenomes from two different...
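
    The normalization idea can be made concrete with a small sketch. It assumes the average genome size (AGS) has already been estimated by some other means and simply converts raw gene hits into copies per genome equivalent; the function names, the gene name, and all numbers are hypothetical, and gene length effects are ignored for brevity.

```python
def genome_equivalents(num_reads, mean_read_length, avg_genome_size):
    """Approximate how many average-sized genomes the sequenced bases cover."""
    return num_reads * mean_read_length / avg_genome_size


def per_genome_abundance(gene_hits, num_reads, mean_read_length, avg_genome_size):
    """Express raw gene hits as copies per genome equivalent."""
    ge = genome_equivalents(num_reads, mean_read_length, avg_genome_size)
    return {gene: hits / ge for gene, hits in gene_hits.items()}


# Hypothetical metagenomes: equal raw hits and depth, different AGS.
sample_a = per_genome_abundance({"nifH": 50}, 1_000_000, 100, 2_000_000)
sample_b = per_genome_abundance({"nifH": 50}, 1_000_000, 100, 4_000_000)
```

    In this toy case the two samples have identical raw counts, yet the gene is twice as frequent per genome in the community with the larger average genome, which is the kind of bias the abstract says normalization is meant to relieve.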

  18. Metagenomic Insights of Microbial Feedbacks to Elevated CO2 (Invited)

    Science.gov (United States)

    Zhou, J.; Tu, Q.; Wu, L.; He, Z.; Deng, Y.; Van Nostrand, J. D.

    2013-12-01

    Understanding the responses of biological communities to elevated CO2 (eCO2) is a central issue in ecology and global change biology, but its impacts on the diversity, composition, structure, function, interactions and dynamics of soil microbial communities remain elusive. In this study, we first examined microbial responses to eCO2 among six FACE sites/ecosystems using a comprehensive functional gene microarray (GeoChip), and then focused on details of metagenome sequencing analysis in one particular site. GeoChip is a comprehensive functional gene array for examining the relationships between microbial community structure and ecosystem functioning and is a very powerful technology for biogeochemical, ecological and environmental studies. The current version of GeoChip (GeoChip 5.0) contains approximately 162,000 probes from 378,000 genes involved in C, N, S and P cycling, organic contaminant degradation, metal resistance, antibiotic resistance, stress responses, metal homeostasis, virulence, pigment production, bacterial phage-mediated lysis, soil beneficial microorganisms, and specific probes for viruses, protists, and fungi. Our experimental results revealed that both ecosystem and CO2 significantly (p changes in the soil microbial community structure were closely correlated with geographic distance, soil NO3-N, NH4-N and C/N ratio. Further metagenome sequencing analysis of soil microbial communities in one particular site showed eCO2 altered the overall structure of soil microbial communities with ambient CO2 samples retaining a higher functional gene diversity than eCO2 samples. Also the taxonomic diversity of functional genes decreased at eCO2. Random matrix theory (RMT)-based network analysis showed that the identified networks under ambient and elevated CO2 were substantially different in terms of overall network topology, network composition, node overlap, module preservation, module-based higher order organization (meta-modules), topological roles of

  19. Request for wood samples

    NARCIS (Netherlands)

    NN,

    1977-01-01

    In recent years the wood collection at the Rijksherbarium was greatly expanded following a renewed interest in wood anatomy as an aid for solving classification problems. Staff members of the Rijksherbarium added to the collection by taking interesting wood samples with them from their expeditions

  20. FY11 Report on Metagenome Analysis using Pathogen Marker Libraries

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, Shea N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Allen, Jonathan E. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); McLoughlin, Kevin S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Slezak, Tom [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2011-06-02

    A method, sequence library, and software suite were invented to rapidly assess whether any member of a pre-specified list of threat organisms or their near neighbors is present in a metagenome. The system was designed to handle mega- to giga-bases of FASTA-formatted raw sequence reads from short or long read next generation sequencing platforms. The approach is to pre-calculate a viral and a bacterial "Pathogen Marker Library" (PML) containing sub-sequences specific to pathogens or their near neighbors. A list of expected matches comparing every bacterial or viral genome against the PML sequences is also pre-calculated. To analyze a metagenome, reads are compared to the PML, and observed PML-metagenome matches are compared to the expected PML-genome matches, and the ratio of observed relative to expected matches is reported. In other words, a 3-way comparison among the PML, metagenome, and existing genome sequences is used to quickly assess which (if any) species included in the PML is likely to be present in the metagenome, based on available sequence data. Our tests showed that the species with the most PML matches correctly indicated the organism sequenced for empirical metagenomes consisting of a cultured, relatively pure isolate. These runs completed in 1 minute to 3 hours on 12 CPU (1 thread/CPU), depending on the metagenome and PML. Using more threads on the same number of CPU resulted in speed improvements roughly proportional to the number of threads. Simulations indicated that detection sensitivity depends on both sequencing coverage levels for a species and the size of the PML: species were correctly detected even at ~0.003x coverage by the large PMLs, and at ~0.03x coverage by the smaller PMLs. Matches to true positive species were 3-4 orders of magnitude higher than to false positives. Simulations with short reads (36 nt and ~260 nt) showed that species were usually detected for metagenome coverage above 0.005x and coverage in the PML above 0.05x, and
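
    The observed-to-expected comparison described in this report can be sketched in a few lines. The species names and match counts are hypothetical, and the sketch assumes the matching step has already produced per-species counts; it only shows the ratio and ranking step.

```python
def observed_to_expected(observed_matches, expected_matches):
    """Return observed/expected PML match ratios per species, highest first."""
    ratios = {}
    for species, expected in expected_matches.items():
        if expected > 0:
            ratios[species] = observed_matches.get(species, 0) / expected
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts: expected = PML hits against each reference genome,
# observed = PML hits against the metagenome reads.
expected = {"species_X": 2000, "species_Y": 1500, "species_Z": 800}
observed = {"species_X": 900, "species_Y": 3}
ranking = observed_to_expected(observed, expected)
```

    Dividing by the precomputed expectation keeps a species with many marker sequences from outranking others on raw match count alone.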

  1. A human gut microbial gene catalogue established by metagenomic sequencing

    DEFF Research Database (Denmark)

    dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn

    2010-01-01

    To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...... gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively....

  2. Life on human surfaces: skin metagenomics.

    Directory of Open Access Journals (Sweden)

    Alban Mathieu

    Full Text Available The human skin microbiome could provide another example, after the gut, of the strong positive or negative impact that human colonizing bacteria can have on health. Deciphering functional diversity and dynamics within human skin microbial communities is critical for understanding their involvement and for developing the appropriate substances for improving or correcting their action. We present a direct PCR-free high throughput sequencing approach to unravel the human skin microbiota specificities through metagenomic dataset analysis and inter-environmental comparison. The approach provided access to the functions carried out by dominant skin colonizing taxa, including Corynebacterium, Staphylococcus and Propionibacterium, revealing their specific capabilities to interact with and exploit compounds from the human skin. These functions, which clearly illustrate the unique life style of the skin microbial communities, stand as invaluable investigation targets for understanding and potentially modifying bacterial interactions with the human host with the objective of increasing health and well being.

  3. [Metagenomics and biodiversity of sphagnum bogs].

    Science.gov (United States)

    Rusin, L Yu

    2016-01-01

    Biodiversity of sphagnum bogs is one of the richest and least studied, while these ecosystems are among the top ones in ecological, conservation, and economic value. Recent studies focused on the prokaryotic consortia associated with sphagnum mosses, and revealed the factors that maintain sustainability and productivity of bog ecosystems. High-throughput sequencing technologies provided insight into functional diversity of moss microbial communities (microbiomes), and helped to identify the biochemical pathways and gene families that facilitate the spectrum of adaptive strategies and largely foster the very successful colonization of the Northern hemisphere by sphagnum mosses. Rich and valuable information obtained on microbiomes of peat bogs sets off the paucity of evidence on their eukaryotic diversity. Prospects and expectations of reliable assessment of taxonomic profiles, relative abundance of taxa, and hidden biodiversity of microscopic eukaryotes in sphagnum bog ecosystems are briefly outlined in the context of today's metagenomics.

  4. MetaProx: the database of metagenomic proximons.

    Science.gov (United States)

    Vey, Gregory; Charles, Trevor C

    2014-01-01

    MetaProx is the database of metagenomic proximons: a searchable repository of proximon objects conceived with two specific goals. The first objective is to accelerate research involving metagenomic functional interactions by providing a database of metagenomic operon candidates. Proximons represent a special subset of directons (series of contiguous co-directional genes) where each member gene is in close proximity to its neighbours with respect to intergenic distance. As a result, proximons represent significant operon candidates where some subset of proximons is the set of true metagenomic operons. Proximons are well suited for the inference of metagenomic functional networks because predicted functional linkages do not rely on homology-dependent information that is frequently unavailable in metagenomic scenarios. The second objective is to explore representations for semistructured biological data that can offer an alternative to the traditional relational database approach. In particular, we use a serialized object implementation and advocate a Data as Data policy where the same serialized objects can be used at all levels (database, search tool and saved user file) without conversion or the use of human-readable markups. MetaProx currently includes 4,210,818 proximons consisting of 8,926,993 total member genes. Database URL: http://metaprox.uwaterloo.ca. © The Author(s) 2014. Published by Oxford University Press.
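
    The definition of a proximon given above (contiguous co-directional genes separated by short intergenic distances) can be sketched as a single pass over the genes of a contig. The gene tuples, the 50 bp gap threshold, and the rule of dropping single genes are assumptions made for illustration; the abstract does not state the cutoff MetaProx uses.

```python
def find_proximons(genes, max_gap):
    """Group consecutive same-strand genes whose intergenic gap is <= max_gap.

    genes: list of (name, start, end, strand) tuples on one contig.
    Returns lists of gene names; single genes are dropped.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    groups, current = [], []
    for gene in ordered:
        if current:
            prev = current[-1]
            same_strand = gene[3] == prev[3]
            gap = gene[1] - prev[2]
            if not (same_strand and gap <= max_gap):
                # Close the running group before starting a new one.
                if len(current) > 1:
                    groups.append([g[0] for g in current])
                current = []
        current.append(gene)
    if len(current) > 1:
        groups.append([g[0] for g in current])
    return groups


# Hypothetical gene coordinates on one contig.
genes = [
    ("g1", 100, 400, "+"),
    ("g2", 420, 900, "+"),
    ("g3", 930, 1500, "+"),
    ("g4", 2200, 2700, "+"),
    ("g5", 2720, 3100, "-"),
    ("g6", 3110, 3600, "-"),
]
proximons = find_proximons(genes, max_gap=50)
```

    Here g4 is left out because it is far from g3 and on the opposite strand from g5, so the result depends only on coordinates and strand, with no homology information required.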

  5. Metagenomic Assessment of a Dynamic Microbial Population from Subseafloor Aquifer Fluids in the Cold, Oxygenated Crust

    Science.gov (United States)

    Tully, B. J.; Heidelberg, J. F.; Kraft, B.; Girguis, P. R.; Huber, J. A.

    2016-12-01

    The oceanic crust contains the largest aquifer on Earth with a volume approximately 2% of the global ocean. Ongoing research at the North Pond (NP) site, west of the Mid-Atlantic Ridge, provides an environment representative of oxygenated crustal aquifers beneath oligotrophic surface waters. Using subseafloor CORK observatories for multiple sampling depths beneath the seafloor, crustal fluids were sampled along the predicted aquifer fluid flow path over a two-year period. DNA was extracted and sequenced for metagenomic analysis from 22 crustal fluid samples, along with the overlying bottom water. At broad taxonomic groupings, the aquifer system is highly dynamic over time and space, with shifts in dominant taxa and "blooms" of transient groups that appear at discrete time points and sample depths. We were able to reconstruct 194 high-quality, low-contamination bacterial and archaeal metagenomic-assembled genomes (MAGs) with estimated completeness >50% (429 MAGs >20% complete). Environmental genomes were assigned to phylogenies from the major bacterial phyla, putative novel groups, and poorly sampled phylogenetic groups, including the Marinimicrobia, Candidate Phyla Radiation, and Planctomycetes. Biogeochemically relevant processes were assigned to MAGs, including denitrification, dissimilatory sulfur and hydrogen cycling, and carbon fixation. Collectively, the oxic NP aquifer system represents a diverse, dynamic microbial habitat with the metabolic potential to impact multiple globally relevant biogeochemical cycles, including nitrogen, sulfur, and carbon.

  6. A Comparison between Transcriptome Sequencing and 16S Metagenomics for Detection of Bacterial Pathogens in Wildlife.

    Directory of Open Access Journals (Sweden)

    Maria Razzauti

    Full Text Available Rodents are major reservoirs of pathogens responsible for numerous zoonotic diseases in humans and livestock. Assessing their microbial diversity at both the individual and population level is crucial for monitoring endemic infections and revealing microbial association patterns within reservoirs. Recently, NGS approaches have been employed to characterize microbial communities of different ecosystems. Yet, their relative efficacy has not been assessed. Here, we compared two NGS approaches, RNA-Sequencing (RNA-Seq) and 16S-metagenomics, assessing their ability to survey neglected zoonotic bacteria in rodent populations. We first extracted nucleic acids from the spleens of 190 voles collected in France. RNA extracts were pooled, randomly retro-transcribed, then RNA-Seq was performed using HiSeq. Assembled bacterial sequences were assigned to the closest taxon registered in GenBank. DNA extracts were analyzed via a 16S-metagenomics approach using two sequencers: the 454 GS-FLX and the MiSeq. The V4 region of the gene coding for 16S rRNA was amplified for each sample using barcoded universal primers. Amplicons were multiplexed and processed on the distinct sequencers. The resulting datasets were de-multiplexed, and each read was processed through a pipeline to be taxonomically classified using the Ribosomal Database Project. Altogether, 45 pathogenic bacterial genera were detected. The bacteria identified by RNA-Seq were comparable to those detected by 16S-metagenomics approach processed with MiSeq (16S-MiSeq). In contrast, 21 of these pathogens went unnoticed when the 16S-metagenomics approach was processed via 454-pyrosequencing (16S-454). In addition, the 16S-metagenomics approaches revealed a high level of coinfection in bank voles. We concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria. Although only the 16S-MiSeq method enabled identification of bacteria in each individual reservoir, with subsequent derivation of

  7. A Comparison between Transcriptome Sequencing and 16S Metagenomics for Detection of Bacterial Pathogens in Wildlife.

    Science.gov (United States)

    Razzauti, Maria; Galan, Maxime; Bernard, Maria; Maman, Sarah; Klopp, Christophe; Charbonnel, Nathalie; Vayssier-Taussat, Muriel; Eloit, Marc; Cosson, Jean-François

    2015-01-01

    Rodents are major reservoirs of pathogens responsible for numerous zoonotic diseases in humans and livestock. Assessing their microbial diversity at both the individual and population level is crucial for monitoring endemic infections and revealing microbial association patterns within reservoirs. Recently, NGS approaches have been employed to characterize microbial communities of different ecosystems. Yet, their relative efficacy has not been assessed. Here, we compared two NGS approaches, RNA-Sequencing (RNA-Seq) and 16S-metagenomics, assessing their ability to survey neglected zoonotic bacteria in rodent populations. We first extracted nucleic acids from the spleens of 190 voles collected in France. RNA extracts were pooled, randomly retro-transcribed, then RNA-Seq was performed using HiSeq. Assembled bacterial sequences were assigned to the closest taxon registered in GenBank. DNA extracts were analyzed via a 16S-metagenomics approach using two sequencers: the 454 GS-FLX and the MiSeq. The V4 region of the gene coding for 16S rRNA was amplified for each sample using barcoded universal primers. Amplicons were multiplexed and processed on the distinct sequencers. The resulting datasets were de-multiplexed, and each read was processed through a pipeline to be taxonomically classified using the Ribosomal Database Project. Altogether, 45 pathogenic bacterial genera were detected. The bacteria identified by RNA-Seq were comparable to those detected by 16S-metagenomics approach processed with MiSeq (16S-MiSeq). In contrast, 21 of these pathogens went unnoticed when the 16S-metagenomics approach was processed via 454-pyrosequencing (16S-454). In addition, the 16S-metagenomics approaches revealed a high level of coinfection in bank voles. We concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria. Although only the 16S-MiSeq method enabled identification of bacteria in each individual reservoir, with subsequent derivation of bacterial prevalence
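
    The de-multiplexing step mentioned in this abstract (barcoded amplicons pooled on one run, then split back per sample) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `demultiplex` function, the barcode sequences and the sample names are hypothetical, and it assumes fixed-length barcodes at the start of each read with exact matching only.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Assign each read to a sample by exact match of its leading barcode.

    Assumes a non-empty barcode table whose barcodes all share one length.
    """
    barcode_len = len(next(iter(barcode_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose barcode is not in the table are set aside, not discarded.
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Made-up barcodes and reads, for illustration only.
barcodes = {"ACGT": "vole_1", "TTAG": "vole_2"}
reads = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTCCCGGG", "GGGGAAAA"]
per_sample = demultiplex(reads, barcodes)
```

    Production tools usually tolerate a mismatch or two in the barcode; exact matching keeps the sketch short.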

  8. Expedition Two crew arrives at KSC

    Science.gov (United States)

    2001-01-01

    Astronaut James Voss (right) stands with astronaut John Young on the tarmac at the KSC Shuttle Landing Facility. Voss is flying on mission STS-102, launching March 8, as part of the Expedition Two crew going to the International Space Station. Young made his fifth flight as Spacecraft Commander of STS-1, the first flight of the Space Shuttle, April 12-14, 1981. His sixth and final flight was as Spacecraft Commander of STS-9, the first Spacelab mission, Nov. 28-Dec. 8, 1983. The other members of the Expedition Two crew are Susan Helms and Yury Usachev. STS-102 will be Helms' and Voss's fifth Shuttle flight, and Usachev's second. They will be replacing the Expedition One crew (Bill Shepherd, Yuri Gidzenko and Sergei Krikalev), who will return to Earth March 20 on Discovery along with the STS-102 crew.

  9. Greenland Expeditions by Alfred Wegener - A photographic window to past

    Science.gov (United States)

    Leitner, M.; Tschürtz, S.; Kirchengast, G.; Kranzelbinder, H.; Prügger, B.; Krause, R. A.; Kalliokoski, M.; Thórhallsdóttir, E.

    2012-04-01

    On several expeditions to Greenland, Alfred Wegener (1880-1930) took pictures on glass plates of landscapes and glaciers, the expedition equipment, the people and animals taking part in the expeditions, as well as physical phenomena such as dust storms, clouds or spherical light phenomena. Chronologically, the plates show the Danmark Expedition 1906-1908, the crossing of Greenland expedition with a stop in Iceland 1912-1913, and the German Greenland Expedition 1929-1930. Until the tragic end of the expedition in 1930, Wegener was professor at the University of Graz, and so a stock of about 300 glass plates stayed there. The aim of our work is to digitize all plates for further studies. We present a first selection of Wegener's Greenland expedition pictures. For those made in Iceland in 1912, we will present a comparison of the past with pictures from the same viewing point made in 2011.

  10. Exploring Genomic Diversity Using Metagenomics of Deep-Sea Subsurface Microbes from the Louisville Seamount and the South Pacific Gyre

    Science.gov (United States)

    Tully, B. J.; Sylvan, J. B.; Heidelberg, J. F.; Huber, J. A.

    2014-12-01

    There are many limitations involved with sampling microbial diversity from deep-sea subsurface environments, including physical sample collection, low microbial biomass, culturing at in situ conditions, and inefficient nucleic acid extractions. As such, we are continually modifying our methods to obtain better results and expanding what we know about microbes in these environments. Here we present analysis of metagenome sequences from samples collected from 120 m within the Louisville Seamount and from the top 5-10 cm of the sediment in the center of the South Pacific Gyre (SPG). Both systems are low biomass with ~10^2 and ~10^4 cells per cm^3 for Louisville Seamount samples analyzed and the SPG sediment, respectively. The Louisville Seamount represents the first in situ subseafloor basalt and the SPG sediments represent the first in situ low biomass sediment microbial metagenomes. Both of these environments, subseafloor basalt and sediments underlying oligotrophic ocean gyres, represent large provinces of the seafloor environment that remain understudied. Despite the low biomass and DNA generated from these samples, we have generated 16 near complete genomes (5 from Louisville and 11 from the SPG) from the two metagenomic datasets. These genomes are estimated to be between 51-100% complete and span a range of phylogenetic groups, including the Proteobacteria, Actinobacteria, Firmicutes, Chloroflexi, and unclassified bacterial groups. With these genomes, we have assessed potential functional capabilities of these organisms and performed a comparative analysis between the environmental genomes and previously sequenced relatives to determine possible adaptations that may elucidate survival mechanisms for these low energy environments. These methods illustrate a baseline analysis that can be applied to future metagenomic deep-sea subsurface datasets and will help to further our understanding of microbiology within these environments.

  11. Expeditions to Komsomolets in 1993 and 1994; Tokt til Komsomolets i 1993 og 1994

    Energy Technology Data Exchange (ETDEWEB)

    Kolstad, A.K.

    1995-09-01

    The Russian nuclear submarine Komsomolets went down about 180 km southwest of the Bear Island in the Norwegian Sea on April 7, 1989. According to Russian information the submarine contains one nuclear reactor and two torpedoes with nuclear warheads. The Norwegian Radiation Protection Authority has taken part in the Russian expeditions to the accident site since 1991. This is a report from the expeditions in 1993 and 1994. It includes sampling, analysis and results obtained by the Norwegian part. 5 refs., 4 figs., 5 tabs.

  12. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

    Science.gov (United States)

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

    2016-07-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly
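
    The distinction this abstract draws between within-study cross-validation and cross-study transfer comes down to how samples are split. A minimal sketch of a leave-one-study-out split follows; it is an assumption about one reasonable way to set up such an evaluation, not the authors' framework, and the `leave_one_study_out` name and the sample dictionaries are hypothetical.

```python
def leave_one_study_out(samples):
    """Yield (held_out_study, train, test) with one whole study held out per round."""
    studies = sorted({s["study"] for s in samples})
    for held_out in studies:
        train = [s for s in samples if s["study"] != held_out]
        test = [s for s in samples if s["study"] == held_out]
        yield held_out, train, test


# Made-up samples, for illustration only.
samples = [
    {"id": "s1", "study": "A", "label": "disease"},
    {"id": "s2", "study": "A", "label": "control"},
    {"id": "s3", "study": "B", "label": "disease"},
    {"id": "s4", "study": "C", "label": "control"},
]
splits = list(leave_one_study_out(samples))
```

    Because no sample from the held-out study appears in training, the resulting accuracy reflects transfer across cohorts rather than fit to one cohort's batch effects.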

  13. Reliable Biomarker discovery from Metagenomic data via RegLRSD algorithm.

    Science.gov (United States)

    Alshawaqfeh, Mustafa; Bashaireh, Ahmad; Serpedin, Erchin; Suchodolski, Jan

    2017-07-10

    Biomarker detection presents itself as a major means of translating biological data into clinical applications. Due to the recent advances in high throughput sequencing technologies, an increased number of metagenomics studies have suggested the dysbiosis in microbial communities as potential biomarker for certain diseases. The reproducibility of the results drawn from metagenomic data is crucial for clinical applications and to prevent incorrect biological conclusions. The variability in the sample size and the subjects participating in the experiments induce diversity, which may drastically change the outcome of biomarker detection algorithms. Therefore, a robust biomarker detection algorithm that ensures the consistency of the results irrespective of the natural diversity present in the samples is needed. Toward this end, this paper proposes a novel Regularized Low Rank-Sparse Decomposition (RegLRSD) algorithm. RegLRSD models the bacterial abundance data as a superposition between a sparse matrix and a low-rank matrix, which account for the differentially and non-differentially abundant microbes, respectively. Hence, the biomarker detection problem is cast as a matrix decomposition problem. In order to yield more consistent and solid biological conclusions, RegLRSD incorporates the prior knowledge that the irrelevant microbes do not exhibit significant variation between samples belonging to different phenotypes. Moreover, an efficient algorithm to extract the sparse matrix is proposed. Comprehensive comparisons of RegLRSD with the state-of-the-art algorithms on three realistic datasets are presented. The obtained results demonstrate that RegLRSD consistently outperforms the other algorithms in terms of reproducibility performance and provides a marker list with high classification accuracy. The proposed RegLRSD algorithm for biomarker detection provides high reproducibility and classification accuracy performance regardless of the dataset complexity and the
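
    The "superposition between a sparse matrix and a low-rank matrix" described in this abstract can be written in the generic low-rank plus sparse form shown below. This is the standard robust-PCA-style objective, given only as a reference point; the symbols D, L, S and λ are notation assumed here, and the additional regularization term that distinguishes RegLRSD is not specified in the abstract, so it is not reproduced.

```latex
% D: abundance matrix (taxa by samples), notation assumed here
% L: low-rank part (non-differentially abundant microbes)
% S: sparse part (differentially abundant microbes)
% \lambda > 0: trade-off weight between the two terms
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to}\quad & D = L + S
\end{aligned}
```

    Here the nuclear norm encourages low rank in L and the entrywise 1-norm encourages sparsity in S; candidate biomarkers would be read off the nonzero rows of S.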

  14. Expedition Three Crew Onboard Photograph of Sunset

    Science.gov (United States)

    2001-01-01

    The setting sun and the thin blue airglow line at Earth's horizon were captured by the International Space Station's (ISS) Expedition Three crewmembers with a digital camera. Some of the Station's components are silhouetted in the foreground. The crew was launched aboard the Space Shuttle Orbiter Discovery STS-105 mission, on August 10, 2001, replacing the Expedition Two crew. After manning the orbiting ISS for 128 consecutive days, the three returned to Earth on December 17, 2001, aboard the STS-108 mission Space Shuttle Orbiter Endeavour.

  15. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.
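
    The fold-enrichment figures in this abstract can be read as a ratio of microbial read fractions with and without enrichment. The sketch below shows that arithmetic under this assumption; the abstract does not state how the authors computed fold enrichment, and the `fold_enrichment` name and the read counts used with it are hypothetical.

```python
def fold_enrichment(microbial_before, total_before, microbial_after, total_after):
    """Ratio of the microbial read fraction after enrichment to the fraction before.

    Requires microbial_before > 0 and both totals > 0.
    """
    fraction_before = microbial_before / total_before
    fraction_after = microbial_after / total_after
    return fraction_after / fraction_before


# Made-up counts: 50 microbial reads per million before, 3,800 per million after.
example = fold_enrichment(50, 1_000_000, 3_800, 1_000_000)
```

    Working with fractions rather than raw counts keeps the result comparable when the two libraries were sequenced to different depths.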

  16. Characterisation of two bifunctional cellulase-xylanase enzymes isolated from a bovine rumen metagenome library

    CSIR Research Space (South Africa)

    Rashamuse, KJ

    2013-02-01

    Full Text Available Ruminant digestive tract microbes hydrolyse plant biomass, and the application of metagenomic techniques can provide good coverage of their glycosyl hydrolase enzymes. A metagenomic library of circa 70,000 fosmids was constructed from bacterial DNA...

  17. Ultrahigh-throughput discovery of promiscuous enzymes by picodroplet functional metagenomics

    NARCIS (Netherlands)

    Colin, Pierre-Yves; Kintses, Balint; Gielen, Fabrice; Miton, Charlotte M; Fischer, Gerhard; Mohamed, Mark F; Hyvönen, Marko; Morgavi, Diego P; Janssen, Dick B; Hollfelder, Florian

    2015-01-01

    Unculturable bacterial communities provide a rich source of biocatalysts, but their experimental discovery by functional metagenomics is difficult, because the odds are stacked against the experimenter. Here we demonstrate functional screening of a million-membered metagenomic library in

  18. Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes

    NARCIS (Netherlands)

    Gori, F.; Tringe, S.G.; Folino, G.; Hijum, S.A.F.T. van; Camp, H.J. Op den; Jetten, M.S.; Marchiori, E.

    2013-01-01

    BACKGROUND: Sequencing technologies have different biases, in single-genome sequencing and metagenomic sequencing; these can significantly affect ORFs recovery and the population distribution of a metagenome. In this paper we investigate how well different technologies represent information related

  19. Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes

    NARCIS (Netherlands)

    Gori, F.; Tringe, S.G.; Folino, G.; Van Hijum, S.A.F.T.; Op den Camp, H.J.M.; Jetten, M.S.M.; Marchiori, E.

    2013-01-01

    Background Sequencing technologies have different biases, in single-genome sequencing and metagenomic sequencing; these can significantly affect ORFs recovery and the population distribution of a metagenome. In this paper we investigate how well different technologies represent information related

  20. Discovery and characterization of a novel lipase with transesterification activity from hot spring metagenomic library

    OpenAIRE

    Yan, Wei; Li, Furong; Wang, Li; Zhu, Yaxin; Dong, Zhiyang; Bai, Linhan

    2016-01-01

    A new gene encoding a lipase (designated as Lip-1) was identified from a metagenomic bacterial artificial chromosome (BAC) library prepared from a concentrated water sample collected from a hot spring field in Niujie, Eryuan of Yunnan province in China. The open reading frame of this gene encoded 622 amino acid residues. It was cloned, fused with the oleosin gene and overexpressed in Escherichia coli to prepare immobilized lipase artificial oil body AOB-sole-lip-1. The monomeric Sole-lip-1 fu...

  1. 20 CFR 404.926 - Agreement in expedited appeals process.

    Science.gov (United States)

    2010-04-01

    ... DISABILITY INSURANCE (1950- ) Determinations, Administrative Review Process, and Reopening of Determinations and Decisions Expedited Appeals Process § 404.926 Agreement in expedited appeals process. If you meet... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Agreement in expedited appeals process. 404...

  2. 7 CFR 1703.112 - Expedited telecommunications loans

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 11 2010-01-01 2010-01-01 false Expedited telecommunications loans 1703.112 Section... § 1703.112 Expedited telecommunications loans RUS will expedite consideration and determination of an application submitted by an RUS telecommunications borrower for a loan under the Act or an advance of such...

  3. Metagenomic analysis of viruses associated with field-grown and retail lettuce identifies human and animal viruses.

    Science.gov (United States)

    Aw, Tiong Gim; Wengert, Samantha; Rose, Joan B

    2016-04-16

    The emergence of culture- and sequence-independent metagenomic methods has not only provided great insight into the microbial community structure in a wide range of clinical and environmental samples, but these methods have also proven to be powerful tools for pathogen detection. Recent studies of the food microbiome have revealed the vast genetic diversity of bacteria associated with fresh produce. However, no work has been done to apply metagenomic methods to tackle viruses associated with fresh produce for addressing food safety. Thus, there is little knowledge about the presence and diversity of viruses associated with fresh produce from farm-to-fork. To address this knowledge gap, we assessed viruses on commercial romaine and iceberg lettuces in fields and a produce distribution center using shotgun metagenomic sequencing targeting both RNA and DNA viruses. Commercial lettuce harbors an immense assemblage of viruses that infect a wide range of hosts. As expected, plant pathogenic viruses dominated these communities. Sequences of rotaviruses and picobirnaviruses were also identified in both field-harvest and retail lettuce samples, suggesting an emerging foodborne transmission threat that has yet to be fully recognized. The identification of human and animal viruses in lettuce samples in the field emphasizes the importance of preventing viral contamination on leafy greens starting at the field. Although there are still some inherent experimental and bioinformatics challenges in applying viral metagenomic approaches for food safety testing, this work will facilitate further application of this unprecedented deep sequencing method to food samples. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Metagenomic and metatranscriptomic analysis of saliva reveals disease-associated microbiota in patients with periodontitis and dental caries

    OpenAIRE

    Belstrøm, Daniel; Constancias, Florentin; Liu, Yang; Yang, Liang; Drautz-Moses, Daniela I; Schuster, Stephan C; Kohli, Gurjeet Singh; Jakobsen, Tim Holm; Holmstrup, Palle; Givskov, Michael

    2017-01-01

    The taxonomic composition of the salivary microbiota has been reported to differentiate between oral health and disease. However, information on bacterial activity and gene expression of the salivary microbiota is limited. The purpose of this study was to perform metagenomic and metatranscriptomic characterization of the salivary microbiota and test the hypothesis that salivary microbial presence and activity could be an indicator of the oral health status. Stimulated saliva samples were coll...

  5. Phylogenetic community ecology of soil biodiversity using mitochondrial metagenomics.

    Science.gov (United States)

    Andújar, Carmelo; Arribas, Paula; Ruzicka, Filip; Crampton-Platt, Alex; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-07-01

    High-throughput DNA methods hold great promise for the study of taxonomically intractable mesofauna of the soil. Here, we assess species diversity and community structure in a phylogenetic framework, by sequencing total DNA from bulk specimen samples and assembly of mitochondrial genomes. The combination of mitochondrial metagenomics and DNA barcode sequencing of 1494 specimens in 69 soil samples from three geographic regions in southern Iberia revealed >300 species of soil Coleoptera (beetles) from a broad spectrum of phylogenetic lineages. A set of 214 mitochondrial sequences longer than 3000 bp was generated and used to estimate a well-supported phylogenetic tree of the order Coleoptera. Shorter sequences, including cox1 barcodes, were placed on this mitogenomic tree. Raw Illumina reads were mapped against all available sequences to test for species present in local samples. This approach simultaneously established the species richness, phylogenetic composition and community turnover at species and phylogenetic levels. We find a strong signature of vertical structuring in soil fauna that shows high local community differentiation between deep soil and superficial horizons at phylogenetic levels. Within the two vertical layers, turnover among regions was primarily at the tip (species) level and was stronger in the deep soil than leaf litter communities, pointing to layer-mediated drivers determining species diversification, spatial structure and evolutionary assembly of soil communities. This integrated phylogenetic framework opens the application of phylogenetic community ecology to the mesofauna of the soil, among the most diverse and least well-understood ecosystems, and will propel both theoretical and applied soil science. © 2015 John Wiley & Sons Ltd.

  6. Metagenomic profiling of microbial composition and antibiotic resistance determinants in Puget Sound.

    Directory of Open Access Journals (Sweden)

    Jesse A Port

    Full Text Available Human-health relevant impacts on marine ecosystems are increasing on both spatial and temporal scales. Traditional indicators for environmental health monitoring and microbial risk assessment have relied primarily on single species analyses and have provided only limited spatial and temporal information. More high-throughput, broad-scale approaches to evaluate these impacts are therefore needed to provide a platform for informing public health. This study uses shotgun metagenomics to survey the taxonomic composition and antibiotic resistance determinant content of surface water bacterial communities in the Puget Sound estuary. Metagenomic DNA was collected at six sites in Puget Sound in addition to one wastewater treatment plant (WWTP) that discharges into the Sound and pyrosequenced. A total of ~550 Mbp (1.4 million reads) were obtained, 22 Mbp of which could be assembled into contigs. While the taxonomic and resistance determinant profiles across the open Sound samples were similar, unique signatures were identified when comparing these profiles across the open Sound, a nearshore marina and WWTP effluent. The open Sound was dominated by α-Proteobacteria (in particular Rhodobacterales sp.), γ-Proteobacteria and Bacteroidetes while the marina and effluent had increased abundances of Actinobacteria, β-Proteobacteria and Firmicutes. There was a significant increase in the antibiotic resistance gene signal from the open Sound to marina to WWTP effluent, suggestive of a potential link to human impacts. Mobile genetic elements associated with environmental and pathogenic bacteria were also differentially abundant across the samples. This study is the first comparative metagenomic survey of Puget Sound and provides baseline data for further assessments of community composition and antibiotic resistance determinants in the environment using next generation sequencing technologies. In addition, these genomic signals of potential human impact can be used
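
    The sequencing totals quoted in this abstract imply two derived figures that are easy to check. The short calculation below uses only the numbers stated above (~550 Mbp, 1.4 million reads, 22 Mbp assembled); the variable names are illustrative, and the results are approximate because the inputs are rounded.

```python
total_bp = 550e6       # ~550 Mbp of sequence, from the abstract
total_reads = 1.4e6    # 1.4 million reads, from the abstract
assembled_bp = 22e6    # 22 Mbp assembled into contigs, from the abstract

mean_read_length = total_bp / total_reads      # about 393 bp per read
assembled_fraction = assembled_bp / total_bp   # about 0.04, i.e. roughly 4% of the data
```

    In other words, only a small share of the sequence data ended up in contigs, so most of the reported profiles necessarily rest on unassembled reads.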

  7. Comparative Metagenomics Reveals the Distinctive Adaptive Features of the Spongia officinalis Endosymbiotic Consortium

    Science.gov (United States)

    Karimi, Elham; Ramos, Miguel; Gonçalves, Jorge M. S.; Xavier, Joana R.; Reis, Margarida P.; Costa, Rodrigo

    2017-01-01

    Current knowledge of sponge microbiome functioning derives mostly from comparative analyses with bacterioplankton communities. We employed a metagenomics-centered approach to unveil the distinct features of the Spongia officinalis endosymbiotic consortium in the context of its two primary environmental vicinities. Microbial metagenomic DNA samples (n = 10) from sponges, seawater, and sediments were subjected to Hiseq Illumina sequencing (c. 15 million 100 bp reads per sample). Totals of 10,272 InterPro (IPR) predicted protein entries and 784 rRNA gene operational taxonomic units (OTUs, 97% cut-off) were uncovered from all metagenomes. Despite the large divergence in microbial community assembly between the surveyed biotopes, the S. officinalis symbiotic community shared slightly greater similarity (p terpenoid synthases presented, to varying degrees, higher frequencies in sediments than in seawater. In contrast, much higher abundances of motility and chemotaxis genes were found in sediments and seawater than in sponges. Higher cell and surface densities, sponge cell shedding and particle uptake, and putative chemical signaling processes favoring symbiont persistence in particulate matrices all may act as mechanisms underlying the observed degrees of taxonomic connectivity and functional conve