WorldWideScience

Sample records for bacterial genome scale

  1. In the fast lane: large-scale bacterial genome engineering.

    Science.gov (United States)

    Fehér, Tamás; Burland, Valerie; Pósfai, György

    2012-07-31

    The last few years have witnessed rapid progress in bacterial genome engineering. The long-established, standard ways of DNA synthesis, modification, transfer into living cells, and incorporation into genomes have given way to more effective, large-scale, robust genome modification protocols. Expansion of these engineering capabilities is due to several factors. Key advances include: (i) progress in oligonucleotide synthesis and in vitro and in vivo assembly methods, (ii) optimization of recombineering techniques, (iii) introduction of parallel, large-scale, combinatorial, and automated genome modification procedures, and (iv) rapid identification of the modifications by barcode-based analysis and sequencing. Combination of the brute force of these techniques with sophisticated bioinformatic design and modeling opens up new avenues for the analysis of gene functions and cellular network interactions, but also in engineering more effective producer strains. This review presents a summary of recent technological advances in bacterial genome engineering.

  2. Genome-scale co-evolutionary inference identifies functions and clients of bacterial Hsp90.

    Directory of Open Access Journals (Sweden)

    Maximilian O Press

    The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinated with Hsp90 throughout bacterial evolution tended to function in flagellar assembly, chemotaxis, and bacterial secretion, suggesting that Hsp90 may aid assembly of protein complexes. To add to the limited set of known bacterial Hsp90 clients, we further developed a statistical method to predict putative clients. We validated our predictions by demonstrating that the flagellar protein FliN and the chemotaxis kinase CheA behaved as Hsp90 clients in Escherichia coli, confirming the predicted role of Hsp90 in chemotaxis and flagellar assembly. Furthermore, normal Hsp90 function is important for wild-type motility and/or chemotaxis in E. coli. This novel function of bacterial Hsp90 agreed with our subsequent finding that Hsp90 is associated with a preference for multiple habitats and may therefore face a complex selection regime. Taken together, our results reveal previously unknown functions of bacterial Hsp90 and open avenues for future experimental exploration by implicating Hsp90 in the assembly of membrane protein complexes and adaptation to novel environments.

  3. Targeted Large-Scale Deletion of Bacterial Genomes Using CRISPR-Nickases.

    Science.gov (United States)

    Standage-Beier, Kylie; Zhang, Qi; Wang, Xiao

    2015-11-20

    Programmable CRISPR-Cas systems have augmented our ability to produce precise genome manipulations. Here we demonstrate and characterize the ability of CRISPR-Cas derived nickases to direct targeted recombination of both small and large genomic regions flanked by repetitive elements in Escherichia coli. While CRISPR directed double-stranded DNA breaks are highly lethal in many bacteria, we show that CRISPR-guided nickase systems can be programmed to make precise, nonlethal, single-stranded incisions in targeted genomic regions. This induces recombination events and leads to targeted deletion. We demonstrate that dual-targeted nicking enables deletion of 36 and 97 Kb of the genome. Furthermore, multiplex targeting enables deletion of 133 Kb, accounting for approximately 3% of the entire E. coli genome. This technology provides a framework for methods to manipulate bacterial genomes using CRISPR-nickase systems. We envision this system working synergistically with preexisting bacterial genome engineering methods.

  4. Biofilm Formation Mechanisms of Pseudomonas aeruginosa Predicted via Genome-Scale Kinetic Models of Bacterial Metabolism.

    Science.gov (United States)

    Vital-Lopez, Francisco G; Reifman, Jaques; Wallqvist, Anders

    2015-10-01

    A hallmark of Pseudomonas aeruginosa is its ability to establish biofilm-based infections that are difficult to eradicate. Biofilms are less susceptible to host inflammatory and immune responses and have higher antibiotic tolerance than free-living planktonic cells. Developing treatments against biofilms requires an understanding of bacterial biofilm-specific physiological traits. Research efforts have started to elucidate the intricate mechanisms underlying biofilm development. However, many aspects of these mechanisms are still poorly understood. Here, we addressed questions regarding biofilm metabolism using a genome-scale kinetic model of the P. aeruginosa metabolic network and gene expression profiles. Specifically, we computed metabolite concentration differences between known mutants with altered biofilm formation and the wild-type strain to predict drug targets against P. aeruginosa biofilms. We also simulated the altered metabolism driven by gene expression changes between biofilm and stationary growth-phase planktonic cultures. Our analysis suggests that the synthesis of important biofilm-related molecules, such as the quorum-sensing molecule Pseudomonas quinolone signal and the exopolysaccharide Psl, is regulated not only through the expression of genes in their own synthesis pathway, but also through the biofilm-specific expression of genes in pathways competing for precursors to these molecules. Finally, we investigated why mutants defective in anthranilate degradation have an impaired ability to form biofilms. Alternative to a previous hypothesis that this biofilm reduction is caused by a decrease in energy production, we proposed that the dysregulation of the synthesis of secondary metabolites derived from anthranilate and chorismate is what impaired the biofilms of these mutants. Notably, these insights generated through our kinetic model-based approach are not accessible from previous constraint-based model analyses of P. aeruginosa biofilm

  5. LocateP: Genome-scale subcellular-location predictor for bacterial proteins

    Directory of Open Access Journals (Sweden)

    Zhou Miaomiao

    2008-03-01

    Background. In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can predict signal peptides and transmembrane helices with high precision, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. Results. LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms

  6. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes.

    Science.gov (United States)

    Sahl, Jason W; Caporaso, J Gregory; Rasko, David A; Keim, Paul

    2014-01-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. 
Taxa-specific genetic markers can then be translated into clinical
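
The BLAST score ratio behind the relatedness matrix described above can be illustrated with a minimal sketch. It assumes the usual definition of the BSR: the bit score of a CDS aligned against a genome, divided by the bit score of the same CDS aligned against itself, giving a value between 0 and 1. The function name `bsr_matrix` and the toy bit scores are hypothetical; this is not the LS-BSR code itself.

```python
def bsr_matrix(self_scores, genome_scores, genomes):
    """Return {cds: {genome: BSR}} from raw alignment bit scores.

    self_scores:   {cds: bit score of the CDS aligned against itself}
    genome_scores: {cds: {genome: best bit score of the CDS in that genome}}
    genomes:       list of genome names to include as matrix columns
    """
    matrix = {}
    for cds, ref in self_scores.items():
        if ref <= 0:
            raise ValueError(f"self score for {cds} must be positive")
        hits = genome_scores.get(cds, {})
        # A genome without a hit gets a score of 0, i.e. the CDS is absent.
        # Ratios are capped at 1.0 in case a hit scores above the self score.
        matrix[cds] = {g: min(hits.get(g, 0.0) / ref, 1.0) for g in genomes}
    return matrix


# Toy, made-up bit scores for two CDSs in two genomes.
self_scores = {"cdsA": 200.0, "cdsB": 150.0}
genome_scores = {"cdsA": {"g1": 200.0, "g2": 100.0}, "cdsB": {"g1": 30.0}}
matrix = bsr_matrix(self_scores, genome_scores, ["g1", "g2"])
```

In this toy matrix, a value near 1 marks a CDS that is conserved in a genome and a value of 0 marks one that is absent, which is the kind of pattern the abstract describes parsing to find clade-specific markers.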

  7. The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

    Directory of Open Access Journals (Sweden)

    Jason W. Sahl

    2014-04-01

    Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27–57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates.
Taxa-specific genetic markers can then be translated

  8. Insights from 20 years of bacterial genome sequencing

    DEFF Research Database (Denmark)

    Land, Miriam; Hauser, Loren; Jun, Se-Ran

    2015-01-01

    of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance...

  9. Marine Bacterial Genomics

    DEFF Research Database (Denmark)

    Machado, Henrique

    microorganisms to be used as cell factories for production. Therefore exploitation of new microbial niches and use of different strategies is an opportunity to boost discoveries. Even though scientists have started to explore several habitats other than the terrestrial ones, the marine environment stands out...... as a hitherto under-explored niche. This thesis work uses high-throughput sequencing technologies on a collection of marine bacteria established during the Galathea 3 expedition, with the purpose of unraveling new biodiversity and new bioactivities. Several tools were used for genomic analysis in order...... to better understand the potential harbored in marine bacteria. The work presented makes use of whole genome sequencing of marine bacteria to prove that the genetic repertoire for secondary metabolite production harbored in these bacteria is far larger than anticipated; to identify and develop a new...

  10. Bacterial genome reengineering.

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E

    2011-01-01

    The web application PrimerPair at ecogene.org generates large sets of paired DNA sequences surrounding all protein and RNA genes of Escherichia coli K-12. Many DNA fragments, which these primers amplify, can be used to implement a genome reengineering strategy using complementary in vitro cloning and in vivo recombineering. The integration of a primer design tool with a model organism database increases the level of quality control. Computer-assisted design of gene primer pairs relies upon having highly accurate genomic DNA sequence information that exactly matches the DNA of the cells being used in the laboratory to ensure predictable DNA hybridizations. It is equally crucial to have confidence that the predicted start codons define the locations of genes accurately. Annotations in the EcoGene database are queried by PrimerPair to eliminate pseudogenes, IS elements, and other problematic genes before the design process starts. These projects progressively familiarize users with the EcoGene content, scope, and application interfaces that are useful for genome reengineering projects. The first protocol leads to the design of a pair of primer sequences that were used to clone and express a single gene. The N-terminal protein sequence was experimentally verified and the protein was detected in the periplasm. This is followed by instructions to design PCR primer pairs for cloning gene fragments encoding 50 periplasmic proteins without their signal peptides. The design process begins with the user simply designating one pair of forward and reverse primer endpoint positions relative to all start and stop codon positions. The gene name, genomic coordinates, and primer DNA sequences are reported to the user. When making chromosomal deletions, the integrity of the provisional primer design is checked to see whether it will generate any unwanted double deletions with adjacent genes. The bad designs are recalculated and replacement primers are provided alongside the

  11. Distribution of Triplet Separators in Bacterial Genomes

    Institute of Scientific and Technical Information of China (English)

    HU Rui; ZHENG Wei-Mou

    2001-01-01

    Distributions of triplet separator lengths for two bacterial complete genomes are analyzed. The theoretical distributions for the independent random sequence and the first-order Markov chain are derived and compared with the distributions of the bacterial genomes. A prominent double band structure, which does not exist in the theoretical distributions, is observed in the bacterial distributions for most triplets.
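
An empirical separator-length distribution of the kind analyzed above could be tabulated as in the sketch below. It takes the separator length to be the distance between the start positions of consecutive occurrences of a triplet, which is an assumption about the paper's definition. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def separator_lengths(seq, triplet):
    """Distances between start positions of successive occurrences of `triplet`."""
    # Overlapping occurrences are counted, since every start index is checked.
    positions = [i for i in range(len(seq) - 2) if seq[i:i + 3] == triplet]
    return [b - a for a, b in zip(positions, positions[1:])]


def separator_histogram(seq, triplet):
    """Count how often each separator length occurs."""
    return Counter(separator_lengths(seq, triplet))


# Toy sequence; a real analysis would read a complete genome instead.
toy = "ATGCCATGATGTTTATG"
lengths = separator_lengths(toy, "ATG")
histogram = separator_histogram(toy, "ATG")
```

Comparing such a histogram with the one obtained from a shuffled or simulated sequence is one way to look for structure that a random model does not reproduce.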

  12. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj;

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also...... in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome...

  13. Insights from twenty years of bacterial genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome
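
The pan- and core-genome calculation mentioned above can be sketched with plain set operations, assuming each genome has already been reduced to a set of gene-family identifiers. The clustering into families is the hard part and is not shown. The function name and the toy family labels are hypothetical.

```python
def pan_and_core(genomes):
    """Return (pan, core) for {genome name: iterable of gene-family ids}.

    pan:  families found in at least one genome (set union)
    core: families found in every genome (set intersection)
    """
    families = [set(f) for f in genomes.values()]
    if not families:
        return set(), set()
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core


# Toy example with three strains and four made-up gene families.
genomes = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famB"},
}
pan, core = pan_and_core(genomes)
```

As more genomes are added the core can only shrink or stay the same and the pan-genome can only grow or stay the same, which is consistent with a core of a few thousand families sitting inside a much larger total for a heavily sequenced species.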

  14. Bacterial Communities: Interactions to Scale

    Directory of Open Access Journals (Sweden)

    Reed M. Stubbendieck

    2016-08-01

    In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bacterial communities are determined by pairwise interactions that occur between different species in the community. Though interactions occur between a few cells at a time, the outcomes of these interchanges have ramifications that ripple through many orders of magnitude, and ultimately affect the macroscopic world including the health of host organisms. In this review we cover how bacterial competition influences the structures of bacterial communities. We also emphasize methods and insights garnered from culture-dependent pairwise interaction studies, metagenomic analyses, and modeling experiments. Finally, we argue that the integration of multiple approaches will be instrumental to future understanding of the underlying dynamics of bacterial communities.

  15. Value of a newly sequenced bacterial genome

    Institute of Scientific and Technical Information of China (English)

    Eudes GV Barbosa; Flavia F Aburjaile; Rommel TJ Ramos; Adriana R Carneiro; Yves Le Loir; Jan Baumbach; Anderson Miyoshi; Artur Silva; Vasco Azevedo

    2014-01-01

    Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information.

  16. Dynamics of genome rearrangement in bacterial populations.

    Directory of Open Access Journals (Sweden)

    Aaron E Darling

    represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes.

  17. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  18. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.

  19. Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

    Science.gov (United States)

    Christen, Matthias; Deutsch, Samuel; Christen, Beat

    2015-08-21

    Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.

  20. Insights from genomics into bacterial pathogen populations.

    Directory of Open Access Journals (Sweden)

    Daniel J Wilson

    2012-09-01

    Bacterial pathogens impose a heavy burden of disease on human populations worldwide. The gravest threats are posed by highly virulent respiratory pathogens, enteric pathogens, and HIV-associated infections. Tuberculosis alone is responsible for the deaths of 1.5 million people annually. Treatment options for bacterial pathogens are being steadily eroded by the evolution and spread of drug resistance. However, population-level whole genome sequencing offers new hope in the fight against pathogenic bacteria. By providing insights into bacterial evolution and disease etiology, these approaches pave the way for novel interventions and therapeutic targets. Sequencing populations of bacteria across the whole genome provides unprecedented resolution to investigate (i) within-host evolution, (ii) transmission history, and (iii) population structure. Moreover, advances in rapid benchtop sequencing herald a new era of real-time genomics in which sequencing and analysis can be deployed within hours in response to rapidly changing public health emergencies. The purpose of this review is to highlight the transformative effect of population genomics on bacteriology, and to consider the prospects for answering abiding questions such as why bacteria cause disease.

  1. The evolution of domain-content in bacterial genomes

    Directory of Open Access Journals (Sweden)

    van Nimwegen Erik

    2008-12-01

    Background. Across all sequenced bacterial genomes, the number of domains $n_c$ in different functional categories $c$ scales as a power law in the total number of domains $n$, i.e. $n_c \propto n^{\alpha_c}$, with exponents $\alpha_c$ that vary across functional categories. Here we investigate the implications of these scaling laws for the evolution of domain-content in bacterial genomes and derive the simplest evolutionary model consistent with these scaling laws. Results. We show that, using only an assumption of time invariance, the scaling laws uniquely determine the relative rates of domain additions and deletions across all functional categories and evolutionary lineages. In particular, the model predicts that the rate of additions and deletions of domains of category $c$ is proportional to the number of domains $n_c$ currently in the genome and we discuss the implications of this observation for the role of horizontal transfer in genome evolution. Second, in addition to being proportional to $n_c$, the rate of additions and deletions of domains of category $c$ is proportional to a category-dependent constant $\rho_c$, which is the same for all evolutionary lineages. This 'evolutionary potential' $\rho_c$ represents the relative probability for additions/deletions of domains of category $c$ to be fixed in the population by selection and is predicted to equal the scaling exponent $\alpha_c$. By comparing the domain content of 93 pairs of closely-related genomes from all over the phylogenetic tree of bacteria, we demonstrate that the model's predictions are supported by available genome-sequence data. Conclusion. Our results establish a direct
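
A scaling exponent of the kind described above, where the number of domains in a category grows as a power of the total number of domains, is commonly estimated as the slope of a straight-line fit in log-log space. The sketch below uses ordinary least squares on synthetic data. The function name and the data are hypothetical, and the paper's own fitting procedure may differ.

```python
import math


def fit_exponent(n_total, n_category):
    """Least-squares slope of log(n_category) against log(n_total)."""
    xs = [math.log(n) for n in n_total]
    ys = [math.log(nc) for nc in n_category]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic genomes whose category counts follow an exact power law
# with exponent 2 (made-up numbers, for illustration only).
totals = [1000, 2000, 4000, 8000]
counts = [0.001 * n ** 2 for n in totals]
alpha = fit_exponent(totals, counts)
```

On real genome data the points scatter around the line, so the fitted slope is an estimate of the exponent for that category and not an exact value.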

  2. Transforming clinical microbiology with bacterial genome sequencing.

    Science.gov (United States)

    Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W

    2012-09-01

    Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.

  3. Gene calling and bacterial genome annotation with BG7.

    Science.gov (United States)

    Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

    2015-01-01

    New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. The tool tolerates sequence errors, maintaining its capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (such as those obtained through metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).

  4. Genomic and Transcriptomic Analyses of Foodborne Bacterial Pathogens

    Science.gov (United States)

    Zhang, Wei; Dudley, Edward G.; Wade, Joseph T.

    DNA microarrays (often interchangeably called DNA chips or DNA arrays) are among the most popular analytical tools for high-throughput comparative genomic and transcriptomic analyses of foodborne bacterial pathogens. A typical DNA microarray contains hundreds to millions of small DNA probes that are chemically attached (or "printed") onto the surface of a microscopic glass slide. Depending on the specific "printing" and probe synthesis technologies for different microarray platforms, such DNA probes can be PCR amplicons or in situ synthesized short oligonucleotides. DNA microarray technologies have revolutionized the way that we investigate the biology of foodborne bacterial pathogens. The major advantage of these technologies is that DNA microarrays allow comparison of subtle genomic or transcriptomic variations between two bacterial samples, such as genomic variations between two different bacterial strains or transcriptomic alterations of the same bacterial strain under two different treatments. Some applications of comparative genomic hybridization microarrays and global gene expression microarrays have been covered in previous chapters of this book.

  5. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh;

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1. The method of genome comparison this month is that of genome alignment and, as an ...

  6. Genome engineering and gene expression control for bacterial strain development.

    Science.gov (United States)

    Song, Chan Woo; Lee, Joungmin; Lee, Sang Yup

    2015-01-01

    In recent years, a number of techniques and tools have been developed for genome engineering and gene expression control to achieve desired phenotypes of various bacteria. Here we review and discuss the recent advances in bacterial genome manipulation and gene expression control techniques, and their actual uses with accompanying examples. Genome engineering has been commonly performed based on homologous recombination. During such genome manipulation, the counterselection systems employing SacB or nucleases have mainly been used for the efficient selection of desired engineered strains. The recombineering technology enables simple and more rapid manipulation of the bacterial genome. The group II intron-mediated genome engineering technology is another option for some bacteria that are difficult to be engineered by homologous recombination. Due to the increasing demands on high-throughput screening of bacterial strains having the desired phenotypes, several multiplex genome engineering techniques have recently been developed and validated in some bacteria. Another approach to achieve desired bacterial phenotypes is the repression of target gene expression without the modification of genome sequences. This can be performed by expressing antisense RNA, small regulatory RNA, or CRISPR RNA to repress target gene expression at the transcriptional or translational level. All of these techniques allow efficient and rapid development and screening of bacterial strains having desired phenotypes, and more advanced techniques are expected to be seen.

  7. Bacteriophage functional genomics and its role in bacterial pathogen detection.

    Science.gov (United States)

    Klumpp, Jochen; Fouts, Derrick E; Sozhamannan, Shanmuga

    2013-07-01

    Emerging and reemerging bacterial infectious diseases are a major public health concern worldwide. The role of bacteriophages in the emergence of novel bacterial pathogens by horizontal gene transfer was highlighted by the May 2011 Escherichia coli O104:H4 outbreaks that originated in Germany and spread to other European countries. This outbreak also highlighted the pivotal role played by recent advances in functional genomics in rapidly deciphering the virulence mechanism elicited by this novel pathogen and developing rapid diagnostics and therapeutics. However, despite a steady increase in the number of phage sequences in the public databases, boosted by the next-generation sequencing technologies, few functional genomics studies of bacteriophages have been conducted. Our definition of 'functional genomics' encompasses a range of aspects: phage genome sequencing, annotation and ascribing functions to phage genes, prophage identification in bacterial sequences, elucidating the events in various stages of phage life cycle using genomic, transcriptomic and proteomic approaches, defining the mechanisms of host takeover including specific bacterial-phage protein interactions and identifying virulence and other adaptive features encoded by phages and finally, using prophage genomic information for bacterial detection/diagnostics. Given the breadth and depth of this definition and the fact that some of these aspects (especially phage-encoded virulence/adaptive features) have been treated extensively in other reviews, we restrict our focus only on certain aspects. These include phage genome sequencing and annotation, identification of prophages in bacterial sequences and genetic characterization of phages, functional genomics of the infection process and finally, bacterial identification using genomic information.

  8. Harnessing CRISPR-Cas systems for bacterial genome editing.

    Science.gov (United States)

    Selle, Kurt; Barrangou, Rodolphe

    2015-04-01

    Manipulation of genomic sequences facilitates the identification and characterization of key genetic determinants in the investigation of biological processes. Genome editing via clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) constitutes a next-generation method for programmable and high-throughput functional genomics. CRISPR-Cas systems are readily reprogrammed to induce sequence-specific DNA breaks at target loci, resulting in fixed mutations via host-dependent DNA repair mechanisms. Although bacterial genome editing is a relatively unexplored and underrepresented application of CRISPR-Cas systems, recent studies provide valuable insights for the widespread future implementation of this technology. This review summarizes recent progress in bacterial genome editing and identifies fundamental genetic and phenotypic outcomes of CRISPR targeting in bacteria, in the context of tool development, genome homeostasis, and DNA repair.

  9. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    Energy Technology Data Exchange (ETDEWEB)

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxon. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genomes.

  10. Identifying characteristic scales in the human genome

    Science.gov (United States)

    Carpena, P.; Bernaola-Galván, P.; Coronado, A. V.; Hackenberg, M.; Oliver, J. L.

    2007-03-01

    The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent α of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.

  11. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    Science.gov (United States)

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes relatively easily and quickly, at a substantially reduced cost per nucleotide and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.

  12. Two-dimensional DNA displays for comparisons of bacterial genomes

    Directory of Open Access Journals (Sweden)

    Malloff Chad

    2003-01-01

    Full Text Available We have developed two whole genome-scanning techniques to aid in the discovery of polymorphisms as well as horizontally acquired genes in prokaryotic organisms. First, two-dimensional bacterial genomic display (2DBGD) was developed using restriction enzyme fragmentation to separate genomic DNA based on size, and then employing denaturing gradient gel electrophoresis (DGGE) in the second dimension to exploit differences in sequence composition. This technique was used to generate high-resolution displays that enable the direct comparison of > 800 genomic fragments simultaneously and can be adapted for the high-throughput comparison of bacterial genomes. 2DBGDs are capable of detecting acquired and altered DNA, however, only in very closely related strains. If used to compare more distantly related strains (e.g. different species within a genus), numerous small changes (i.e. small deletions and point mutations) unrelated to the interesting phenotype would encumber the comparison of 2DBGDs. For this reason a second method, bacterial comparative genomic hybridization (BCGH), was developed to directly compare bacterial genomes to identify gain or loss of genomic DNA. BCGH relies on performing 2DBGD on a pooled sample of genomic DNA from 2 strains to be compared and subsequently hybridizing the resulting 2DBGD blot separately with DNA from each individual strain. Unique spots (hybridization signals) represent foreign DNA. The identification of novel DNA is easily achieved by excising the DNA from a dried gel followed by subsequent cloning and sequencing. 2DBGD and BCGH thus represent novel high resolution genome scanning techniques for directly identifying altered and/or acquired DNA.

  13. Bacterial Cellular Engineering by Genome Editing and Gene Silencing

    Directory of Open Access Journals (Sweden)

    Nobutaka Nakashima

    2014-02-01

    Full Text Available Genome editing is an important technology for bacterial cellular engineering, which is commonly conducted by homologous recombination-based procedures, including gene knockout (disruption), knock-in (insertion), and allelic exchange. In addition, some new recombination-independent approaches have emerged that utilize catalytic RNAs, artificial nucleases, nucleic acid analogs, and peptide nucleic acids. Apart from these methods, which directly modify the genomic structure, an alternative approach is to conditionally modify the gene expression profile at the posttranscriptional level without altering the genomes. This is performed by expressing antisense RNAs to knock down (silence) target mRNAs in vivo. This review describes the features and recent advances on methods used in genomic engineering and silencing technologies that are advantageously used for bacterial cellular engineering.

  14. Bacterial Recombineering: Genome Engineering via Phage-Based Homologous Recombination.

    Science.gov (United States)

    Pines, Gur; Freed, Emily F; Winkler, James D; Gill, Ryan T

    2015-11-20

    The ability to specifically modify bacterial genomes in a precise and efficient manner is highly desired in various fields, ranging from molecular genetics to metabolic engineering and synthetic biology. Much has changed from the initial realization that phage-derived genes may be employed for such tasks to today, where recombineering enables complex genetic edits within a genome or a population. Here, we review the major developments leading to recombineering becoming the method of choice for in situ bacterial genome editing while highlighting the various applications of recombineering in pushing the boundaries of synthetic biology. We also present the current understanding of the mechanism of recombineering. Finally, we discuss in detail issues surrounding recombineering efficiency and future directions for recombineering-based genome editing.

  15. Differentiation of regions with atypical oligonucleotide composition in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Reva Oleg N

    2005-10-01

    Full Text Available Abstract Background Complete sequencing of bacterial genomes has become a common technique of present day microbiology. Thereafter, data mining in the complete sequence is an essential step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes. Results A total of 163 bacterial genomes of eubacteria and archaea published in the NCBI database were analyzed. Local OU patterns exhibit substantial intrachromosomal variation in bacteria. Loci with alternative OU patterns were parts of horizontally acquired gene islands or ancient regions such as genes for ribosomal proteins and RNAs. OU statistical parameters, such as local pattern deviation (D), pattern skew (PS) and OU variance (OUV) enabled the detection and visualization of gene islands of different functional classes. Conclusion A set of approaches has been designed for the statistical analysis of nucleotide sequences of bacterial genomes. These methods are useful for the visualization and differentiation of regions with atypical oligonucleotide composition prior to or accompanying gene annotation.
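
    The idea of scanning a genome for windows whose local oligonucleotide usage departs from the genome-wide pattern can be sketched as follows. This is only an illustration: the L1 distance between k-mer frequency tables used here is an assumed stand-in, not the D, PS or OUV statistics of the record above, and the function names, window size and step are hypothetical.

```python
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def window_deviation(seq, k=4, window=5000, step=2500):
    """L1 distance between each window's k-mer usage and the whole sequence.

    Returns a list of (window_start, distance) tuples. The distance is an
    assumed, simplified measure chosen for illustration only.
    """
    global_f = kmer_freqs(seq, k)
    result = []
    for start in range(0, len(seq) - window + 1, step):
        local_f = kmer_freqs(seq[start:start + window], k)
        kmers = set(global_f) | set(local_f)
        dist = sum(abs(local_f.get(m, 0.0) - global_f.get(m, 0.0)) for m in kmers)
        result.append((start, dist))
    return result


# Toy sequence (made up): a homopolymer island inside a periodic background.
toy = "ACGT" * 50 + "A" * 100 + "ACGT" * 50
scan = window_deviation(toy, k=2, window=100, step=100)
most_atypical_start = max(scan, key=lambda item: item[1])[0]
```

    On the toy sequence the window covering the inserted island has the largest distance; real analyses would use longer words, larger windows and a proper statistical normalization.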

  16. [Bacterial genomics and metagenomics: clinical applications and medical relevance].

    Science.gov (United States)

    Diene, S M; Bertelli, C; Pillonel, T; Schrenzel, J; Greub, G

    2014-11-12

    New sequencing technologies provide, quickly and at low cost, large amounts of genomic sequence data useful for applications such as: a) development of diagnostic PCRs and/or serological tests; b) detection of virulence factors (virulome) or genes/SNPs associated with resistance to antibiotics (resistome) and c) investigation of transmission and dissemination of bacterial pathogens. Thus, bacterial genomics of medical importance is useful to clinical microbiologists, to infectious diseases specialists as well as to epidemiologists. Determining the microbial composition of a sample by metagenomics is another application of new sequencing technologies, useful to understand the impact of bacteria on various non-infectious diseases such as obesity, asthma, or diabetes. Genomics and metagenomics will likely become a specialized diagnostic analysis.

  17. CONTIGuator: a bacterial genomes finishing tool for structural insights on draft genomes

    Directory of Open Access Journals (Sweden)

    Bazzicalupo Marco

    2011-06-01

    Full Text Available Abstract Recent developments in sequencing technologies have given the opportunity to sequence many bacterial genomes with limited cost and labor, compared to previous techniques. However, a limiting step of genome sequencing is the finishing process, needed to infer the relative position of each contig and close sequencing gaps. An additional degree of complexity is given by bacterial species harboring more than one replicon, which are not contemplated by the currently available programs. The availability of a large number of bacterial genomes allows geneticists to use complete genomes (possibly from the same species as templates for contigs mapping. Here we present CONTIGuator, a software tool for contigs mapping over a reference genome which allows the visualization of a map of contigs, underlining loss and/or gain of genetic elements and permitting to finish multipartite genomes. The functionality of CONTIGuator was tested using four genomes, demonstrating its improved performances compared to currently available programs. Our approach appears efficient, with a clear visualization, allowing the user to perform comparative structural genomics analysis on draft genomes. CONTIGuator is a Python script for Linux environments and can be used on normal desktop machines and can be downloaded from http://contiguator.sourceforge.net.

  18. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  19. Genome scale engineering techniques for metabolic engineering.

    Science.gov (United States)

    Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

    2015-11-01

    Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications.

  20. Comparative bacterial proteomics: analysis of the core genome concept.

    Directory of Open Access Journals (Sweden)

    Stephen J Callister

    Full Text Available While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  1. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Energy Technology Data Exchange (ETDEWEB)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database developed over a six-year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  2. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/

  3. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contaminations. Visualization has been generated of genomic data in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in improvement of N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.
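
    For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of how it is commonly computed from a list of contig lengths follows. The function names and the toy numbers are illustrative assumptions and do not come from the record; the common definition is used (the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length).

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


def percent_improvement(old_value, new_value):
    """Relative change from old_value to new_value, in percent."""
    return 100 * (new_value - old_value) / old_value


# Toy contig lengths (made up for illustration).
draft = [2, 3, 4, 5, 6, 7, 8, 9, 10]
draft_n50 = n50(draft)
```

    Because N50 depends only on the length distribution, it says nothing about correctness, which is presumably why the abstract reports reference coverage alongside it.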

  4. Elucidation of operon structures across closely related bacterial genomes.

    Science.gov (United States)

    Zhou, Chuan; Ma, Qin; Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.
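
    The graph construction described in this abstract can be sketched as below, under stated assumptions: operon predictions and gene-to-OGG assignments are taken as given inputs, all function names and the toy data are hypothetical, and the path/star classification is a simple degree-based heuristic rather than the authors' procedure.

```python
def build_ogg_graph(operons_by_genome, gene_to_ogg):
    """Nodes are OGGs; two OGGs are linked if their genes are immediate
    neighbors in the same operon in any genome."""
    graph = {}
    for operons in operons_by_genome.values():
        for operon in operons:
            oggs = [gene_to_ogg[gene] for gene in operon]
            for ogg in oggs:
                graph.setdefault(ogg, set())
            for left, right in zip(oggs, oggs[1:]):
                if left != right:
                    graph[left].add(right)
                    graph[right].add(left)
    return graph


def connected_components(graph):
    """Connected components of an undirected adjacency-set graph."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def classify_topology(graph, component):
    """Label a component 'path', 'star', 'tree' or 'other' (has a cycle).

    Components with one to three nodes that form a chain count as 'path'.
    """
    n = len(component)
    degrees = [len(graph[node]) for node in component]
    if sum(degrees) // 2 != n - 1:
        return "other"
    if max(degrees) <= 2:
        return "path"
    if max(degrees) == n - 1:
        return "star"
    return "tree"


# Toy input (made up): gene names are "<genome>_<OGG>".
operons_by_genome = {
    "genome_A": [["A_rpsA", "A_rpsB", "A_rpsC"], ["A_sbp1", "A_perm", "A_sbp2"]],
    "genome_B": [["B_rpsA", "B_rpsB"], ["B_sbp3", "B_perm"]],
}
gene_to_ogg = {
    gene: gene.split("_", 1)[1]
    for operons in operons_by_genome.values()
    for operon in operons
    for gene in operon
}
graph = build_ogg_graph(operons_by_genome, gene_to_ogg)
components = connected_components(graph)
labels = {frozenset(c): classify_topology(graph, c) for c in components}
```

    In the toy data the ribosomal-protein OGGs form a path and the permease sits at the center of a star with its substrate-binding proteins, mirroring the two topologies named in the abstract.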

  5. Elucidation of operon structures across closely related bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Chuan Zhou

    Full Text Available About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.

  6. Reconstruction of a Bacterial Genome from DNA Cassettes

    Energy Technology Data Exchange (ETDEWEB)

    Christopher Dupont; John Glass; Laura Sheahan; Shibu Yooseph; Lisa Zeigler Allen; Mathangi Thiagarajan; Andrew Allen; Robert Friedman; J. Craig Venter

    2011-12-31

    This basic research program comprised two major areas: (1) acquisition and analysis of marine microbial metagenomic data and development of genomic analysis tools for broad, external community use; (2) development of a minimal bacterial genome. Our Marine Metagenomic Diversity effort generated and analyzed shotgun sequencing data from microbial communities sampled from over 250 sites around the world. About 40% of the 26 Gbp of sequence data has been made publicly available to date with a complete release anticipated in six months. Our results and those mining the deposited data have revealed a vast diversity of genes coding for critical metabolic processes whose phylogenetic and geographic distributions will enable a deeper understanding of carbon and nutrient cycling, microbial ecology, and rapid rate evolutionary processes such as horizontal gene transfer by viruses and plasmids. A global assembly of the generated dataset resulted in a massive set (5Gbp) of genome fragments that provide context to the majority of the generated data that originated from uncultivated organisms. Our Synthetic Biology team has made significant progress towards the goal of synthesizing a minimal mycoplasma genome that will have all of the machinery for independent life. This project, once completed, will provide fundamentally new knowledge about requirements for microbial life and help to lay a basic research foundation for developing microbiological approaches to bioenergy.

  7. Bacterial delivery of TALEN proteins for human genome editing.

    Directory of Open Access Journals (Sweden)

    Jingyue Jia

    Full Text Available Transcription Activator-Like Effector Nucleases (TALENs) are a novel class of sequence-specific nucleases that have recently gained prominence for their ease of production and high efficiency in genome editing. A TALEN pair recognizes specific DNA sequences and introduces a double-strand break in the target site, triggering non-homologous end joining and homologous recombination. Current methods of TALEN delivery involve introduction of foreign genetic materials, such as plasmid DNA or mRNA, through transfection. Here, we show an alternative way of TALEN delivery, bacterial type III secretion system (T3SS) mediated direct injection of the TALEN proteins into human cells. Bacterially injected TALEN was shown to efficiently target host cell nucleus where it persists for almost 12 hours. Using a pair of TALENs targeting venus gene, such injected nuclear TALENs were shown functional in introducing DNA mutation in the target site. Interestingly, S-phase cells seem to show greater sensitivity to the TALEN mediated target gene modification. Accordingly, efficiency of such genome editing can easily be manipulated by the infection dose, number of repeated infections as well as enrichment of S phase cells. This work further extends the utility of T3SS in the delivery of functional proteins into mammalian cells to alter their characters for biomedical applications.

  8. Bacterial delivery of TALEN proteins for human genome editing.

    Science.gov (United States)

    Jia, Jingyue; Jin, Yongxin; Bian, Ting; Wu, Donghai; Yang, Lijun; Terada, Naohiro; Wu, Weihui; Jin, Shouguang

    2014-01-01

    Transcription Activator-Like Effector Nucleases (TALENs) are a novel class of sequence-specific nucleases that have recently gained prominence for their ease of production and high efficiency in genome editing. A TALEN pair recognizes specific DNA sequences and introduces a double-strand break in the target site, triggering non-homologous end joining and homologous recombination. Current methods of TALEN delivery involve introduction of foreign genetic materials, such as plasmid DNA or mRNA, through transfection. Here, we show an alternative way of TALEN delivery, bacterial type III secretion system (T3SS) mediated direct injection of the TALEN proteins into human cells. Bacterially injected TALEN was shown to efficiently target host cell nucleus where it persists for almost 12 hours. Using a pair of TALENs targeting venus gene, such injected nuclear TALENs were shown functional in introducing DNA mutation in the target site. Interestingly, S-phase cells seem to show greater sensitivity to the TALEN mediated target gene modification. Accordingly, efficiency of such genome editing can easily be manipulated by the infection dose, number of repeated infections as well as enrichment of S phase cells. This work further extends the utility of T3SS in the delivery of functional proteins into mammalian cells to alter their characters for biomedical applications.

  9. Reductive genome evolution at both ends of the bacterial population size spectrum.

    Science.gov (United States)

    Batut, Bérénice; Knibbe, Carole; Marais, Gabriel; Daubin, Vincent

    2014-12-01

    Bacterial genomes show substantial variations in size. The smallest bacterial genomes are those of endocellular symbionts of eukaryotic hosts, which have undergone massive genome reduction and show patterns that are consistent with the degenerative processes that are predicted to occur in species with small effective population sizes. However, similar genome reduction is found in some free-living marine cyanobacteria that are characterized by extremely large populations. In this Opinion article, we discuss the different hypotheses that have been proposed to account for this reductive genome evolution at both ends of the bacterial population size spectrum.

  10. Identification of DNA motifs implicated in maintenance of bacterial core genomes by predictive modeling.

    Science.gov (United States)

    Halpern, David; Chiapello, Hélène; Schbath, Sophie; Robin, Stéphane; Hennequet-Antier, Christelle; Gruss, Alexandra; El Karoui, Meriem

    2007-09-01

    Bacterial biodiversity at the species level, in terms of gene acquisition or loss, is so immense that it raises the question of how essential chromosomal regions are spared from uncontrolled rearrangements. Protection of the genome likely depends on specific DNA motifs that impose limits on the regions that undergo recombination. Although most such motifs remain unidentified, they are theoretically predictable based on their genomic distribution properties. We examined the distribution of the "crossover hotspot instigator," or Chi, in Escherichia coli, and found that its exceptional distribution is restricted to the core genome common to three strains. We then formulated a set of criteria that were incorporated in a statistical model to search core genomes for motifs potentially involved in genome stability in other species. Our strategy led us to identify and biologically validate two distinct heptamers that possess Chi properties, one in Staphylococcus aureus, and the other in several streptococci. This strategy paves the way for wide-scale discovery of other important functional noncoding motifs that distinguish core genomes from the strain-variable regions.
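
    The starting observation in this abstract, that a motif such as Chi is distributed differently in core and strain-variable regions, can be illustrated with a simple density count. The Python sketch below is a deliberately simplified stand-in for the statistical model the authors describe: it only counts occurrences of the E. coli Chi octamer (5'-GCTGGTGG-3') on both strands per kilobase. The helper names and the two toy sequences are hypothetical.

```python
CHI = "GCTGGTGG"  # E. coli Chi site, written 5'->3'
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_hits(seq, motif=CHI):
    """Count motif occurrences on either strand of seq."""
    seq = seq.upper()
    hits = count_overlapping(seq, motif)
    rc = reverse_complement(motif)
    if rc != motif:  # avoid double counting palindromic motifs
        hits += count_overlapping(seq, rc)
    return hits


def motif_density_per_kb(seq, motif=CHI):
    """Motif occurrences (both strands) per 1000 bases."""
    return 1000 * motif_hits(seq, motif) / len(seq)


# Hypothetical toy regions (illustrative only).
core_like = "AAGCTGGTGGTTCCACCAGCAA"
variable_like = "ATATATATATATATATATATAT"

print(motif_density_per_kb(core_like), motif_density_per_kb(variable_like))
```

    A raw density comparison like this ignores base composition and strand orientation relative to replication, which is the kind of information a statistical model of the motif's distribution would need to take into account.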

  11. Identification of DNA motifs implicated in maintenance of bacterial core genomes by predictive modeling.

    Directory of Open Access Journals (Sweden)

    David Halpern

    2007-09-01

    Full Text Available Bacterial biodiversity at the species level, in terms of gene acquisition or loss, is so immense that it raises the question of how essential chromosomal regions are spared from uncontrolled rearrangements. Protection of the genome likely depends on specific DNA motifs that impose limits on the regions that undergo recombination. Although most such motifs remain unidentified, they are theoretically predictable based on their genomic distribution properties. We examined the distribution of the "crossover hotspot instigator," or Chi, in Escherichia coli, and found that its exceptional distribution is restricted to the core genome common to three strains. We then formulated a set of criteria that were incorporated in a statistical model to search core genomes for motifs potentially involved in genome stability in other species. Our strategy led us to identify and biologically validate two distinct heptamers that possess Chi properties, one in Staphylococcus aureus, and the other in several streptococci. This strategy paves the way for wide-scale discovery of other important functional noncoding motifs that distinguish core genomes from the strain-variable regions.

  12. Systematic determination of the mosaic structure of bacterial genomes: species backbone versus strain-specific loops

    Directory of Open Access Journals (Sweden)

    Gendrault-Jacquemard A

    2005-07-01

    Full Text Available Abstract Background Public databases now contain a multitude of complete bacterial genomes, including several genomes of the same species. The available data offers new opportunities to address questions about bacterial genome evolution, a task that requires reliable fine comparison data of closely related genomes. Recent analyses have shown, using pairwise whole genome alignments, that it is possible to segment bacterial genomes into a common conserved backbone and strain-specific sequences called loops. Results Here, we generalize this approach and propose a strategy that allows systematic and non-biased genome segmentation based on multiple genome alignments. Segmentation analyses, as applied to 13 different bacterial species, confirmed the feasibility of our approach to discern the 'mosaic' organization of bacterial genomes. Segmentation results are available through a Web interface permitting functional analysis, extraction and visualization of the backbone/loops structure of documented genomes. To illustrate the potential of this approach, we performed a precise analysis of the mosaic organization of three E. coli strains and functional characterization of the loops. Conclusion The segmentation results including the backbone/loops structure of 13 bacterial species genomes are new and available for use by the scientific community at the URL: http://genome.jouy.inra.fr/mosaic.

  13. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis;

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes... and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https...

  14. Impact of genome reduction on bacterial metabolism and its regulation.

    Science.gov (United States)

    Yus, Eva; Maier, Tobias; Michalodimitrakis, Konstantinos; van Noort, Vera; Yamada, Takuji; Chen, Wei-Hua; Wodke, Judith A H; Güell, Marc; Martínez, Sira; Bourgeois, Ronan; Kühner, Sebastian; Raineri, Emanuele; Letunic, Ivica; Kalinina, Olga V; Rode, Michaela; Herrmann, Richard; Gutiérrez-Gallego, Ricardo; Russell, Robert B; Gavin, Anne-Claude; Bork, Peer; Serrano, Luis

    2009-11-27

    To understand basic principles of bacterial metabolism organization and regulation, but also the impact of genome size, we systematically studied one of the smallest bacteria, Mycoplasma pneumoniae. A manually curated metabolic network of 189 reactions catalyzed by 129 enzymes allowed the design of a defined, minimal medium with 19 essential nutrients. More than 1300 growth curves were recorded in the presence of various nutrient concentrations. Measurements of biomass indicators, metabolites, and 13C-glucose experiments provided information on directionality, fluxes, and energetics; integration with transcription profiling enabled the global analysis of metabolic regulation. Compared with more complex bacteria, the M. pneumoniae metabolic network has a more linear topology and contains a higher fraction of multifunctional enzymes; general features such as metabolite concentrations, cellular energetics, adaptability, and global gene expression responses are similar, however.

  15. Genome scale metabolic modeling of cancer

    DEFF Research Database (Denmark)

    Nilsson, Avlant; Nielsen, Jens

    2016-01-01

    Cancer cells reprogram metabolism to support rapid proliferation and survival. Energy metabolism is particularly important for growth and genes encoding enzymes involved in energy metabolism are frequently altered in cancer cells. A genome scale metabolic model (GEM) is a mathematical formalization of metabolism which allows simulation and hypotheses testing of metabolic strategies. It has successfully been applied to many microorganisms and is now used to study cancer metabolism. Generic models of human metabolism have been reconstructed based on the existence of metabolic genes in the human genome... been used as scaffolds for analysis of high throughput data to allow mechanistic interpretation of changes in expression. Finally, GEMs allow quantitative flux predictions using flux balance analysis (FBA). Here we critically review the requirements for successful FBA simulations of cancer cells...

  16. Genome trees constructed using five different approaches suggest new major bacterial clades

    Directory of Open Access Journals (Sweden)

    Tatusov Roman L

    2001-10-01

    Full Text Available Abstract Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: (i) presence-absence of genomes in clusters of orthologous genes; (ii) conservation of local gene order (gene pairs) among prokaryotic genomes; (iii) parameters of identity distribution for probable orthologs; (iv) analysis of concatenated alignments of ribosomal proteins; (v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: (i) Chlamydia-Spirochetes, (ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and (iii) Actinomycetes-Deinococcales-Cyanobacteria. The latter group also

  17. Predicting statistical properties of open reading frames in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Katharina Mir

    Full Text Available An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
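
    Why GC content should shape ORF lengths at all can be seen from a much cruder null model than the one in this abstract. The Python sketch below assumes independent nucleotides with P(G) = P(C) and P(A) = P(T), which is an assumption made here for illustration and not the authors' codon-composition model. Under that assumption the chance that a random codon is a stop (TAA, TAG or TGA) follows directly from the GC content, and the number of codons up to and including the first stop is geometrically distributed. The function names are hypothetical.

```python
def stop_codon_probability(gc_content):
    """Probability that a random codon is TAA, TAG or TGA, assuming
    independent bases with P(G) = P(C) and P(A) = P(T)."""
    if not 0 < gc_content < 1:
        raise ValueError("gc_content must be strictly between 0 and 1")
    p_gc = gc_content / 2        # P(G) = P(C)
    p_at = (1 - gc_content) / 2  # P(A) = P(T)
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2 * p_at ** 2 * p_gc


def expected_codons_to_stop(gc_content):
    """Mean number of codons up to and including the first stop codon
    (mean of a geometric distribution with success probability p)."""
    return 1 / stop_codon_probability(gc_content)


# The GC range quoted in the abstract: 21% to 74%.
for gc in (0.21, 0.50, 0.74):
    print(gc, round(expected_codons_to_stop(gc), 1))
```

    Because all three stop codons are AT-rich, this toy model predicts longer random ORFs in GC-rich genomes; deviations of real genomes from such a baseline are what the abstract interprets as candidate coding capacity.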

  18. Sequencing of Bacterial Genomes: Principles and Insights into Pathogenesis and Development of Antibiotics

    Directory of Open Access Journals (Sweden)

    Eric S. Donkor

    2013-10-01

    Full Text Available The impact of bacterial diseases on public health has become enormous, and is partly due to the increasing trend of antibiotic resistance displayed by bacterial pathogens. Sequencing of bacterial genomes has significantly improved our understanding of the biology of many bacterial pathogens as well as the identification of novel antibiotic targets. Since the advent of genome sequencing two decades ago, about 1,800 bacterial genomes have been fully sequenced and these include important aetiological agents such as Streptococcus pneumoniae, Mycobacterium tuberculosis, Escherichia coli O157:H7, Vibrio cholerae, Clostridium difficile and Staphylococcus aureus. Very recently, there has been an explosion of bacterial genome data, owing to the development of next generation sequencing technologies, which are evolving rapidly. Indeed, the field of microbial genomics is advancing at a very fast rate and it is difficult for researchers to keep abreast of the new developments. This highlights the need for regular updates in microbial genomics through comprehensive reviews. This review paper seeks to provide an update on bacterial genome sequencing generally, and to analyze insights gained from sequencing in two areas: bacterial pathogenesis and the development of antibiotics.

  19. The SeqWord Genome Browser: an online tool for the identification and visualization of atypical regions of bacterial genomes through oligonucleotide usage

    Directory of Open Access Journals (Sweden)

    Tümmler Burkhard

    2008-08-01

    Full Text Available Abstract Background Data mining in large DNA sequences is a major challenge in microbial genomics and bioinformatics. Oligonucleotide usage (OU) patterns provide a wealth of information for large scale sequence analysis and visualization. The purpose of this research was to make OU statistical analysis available as a novel web-based tool for functional genomics and annotation. The tool is also available as a downloadable package. Results The SeqWord Genome Browser (SWGB) was developed to visualize the natural compositional variation of DNA sequences. The applet is also used for identification of divergent genomic regions both in annotated sequences of bacterial chromosomes, plasmids, phages and viruses, and in raw DNA sequences prior to annotation by comparing local and global OU patterns. The applet allows fast and reliable identification of clusters of horizontally transferred genomic islands, large multi-domain genes and genes for ribosomal RNA. Within the majority of genomic fragments (also termed genomic core sequence), regions enriched with housekeeping genes, ribosomal proteins and the regions rich in pseudogenes or genetic vestiges may be contrasted. Conclusion The SWGB applet presents a range of comprehensive OU statistical parameters calculated for a range of bacterial species, plasmids and phages. It is available on the Internet at http://www.bi.up.ac.za/SeqWord/mhhapplet.php.
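
    The idea of comparing local and global oligonucleotide usage can be sketched in a few lines of Python. This is an illustrative simplification and not the SWGB algorithm: it scores each sliding window by the summed absolute difference between its k-mer frequencies and those of the whole sequence. The function names, the distance measure, the window settings and the toy sequence are all assumptions.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of every A/C/G/T k-mer in seq (overlapping)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def ou_distance(local_freq, global_freq):
    """Summed absolute difference between two k-mer frequency tables."""
    return sum(abs(local_freq[m] - global_freq[m]) for m in global_freq)


def scan_windows(genome, window, step, k=4):
    """Return (start, distance) for each sliding window along genome."""
    global_freq = kmer_frequencies(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        local_freq = kmer_frequencies(genome[start:start + window], k)
        scores.append((start, ou_distance(local_freq, global_freq)))
    return scores


# Hypothetical toy sequence: a compositionally atypical insert at 200..279.
genome = "ACGT" * 50 + "GGGGCCCC" * 10 + "ACGT" * 50

scores = scan_windows(genome, window=80, step=40, k=2)
most_atypical_start = max(scores, key=lambda item: item[1])[0]
print(most_atypical_start)
```

    On real genomes longer words (for example tetranucleotides) and much larger windows would be typical; the short word and window sizes here only keep the toy example small.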

  20. A new experimental approach for studying bacterial genomic island evolution identifies island genes with bacterial host-specific expression patterns

    Directory of Open Access Journals (Sweden)

    Nickerson Cheryl A

    2006-01-01

    Full Text Available Abstract Background Genomic islands are regions of bacterial genomes that have been acquired by horizontal transfer and often contain blocks of genes that function together for specific processes. Recently, it has become clear that the impact of genomic islands on the evolution of different bacterial species is significant and represents a major force in establishing bacterial genomic variation. However, the study of genomic island evolution has been mostly performed at the sequence level using computer software or hybridization analysis to compare different bacterial genomic sequences. We describe here a novel experimental approach to study the evolution of species-specific bacterial genomic islands that identifies island genes that have evolved in such a way that they are differentially-expressed depending on the bacterial host background into which they are transferred. Results We demonstrate this approach by using a "test" genomic island that we have cloned from the Salmonella typhimurium genome (island 4305) and transferred to a range of Gram negative bacterial hosts of differing evolutionary relationships to S. typhimurium. Systematic analysis of the expression of the island genes in the different hosts compared to proper controls allowed identification of genes with genera-specific expression patterns. The data from the analysis can be arranged in a matrix to give an expression "array" of the island genes in the different bacterial backgrounds. A conserved 19-bp DNA site was found upstream of at least two of the differentially-expressed island genes. To our knowledge, this is the first systematic analysis of horizontally-transferred genomic island gene expression in a broad range of Gram negative hosts. We also present evidence in this study that the IS200 element found in island 4305 in S. typhimurium strain LT2 was inserted after the island had already been acquired by the S. typhimurium lineage and that this element is likely not

  1. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

    Directory of Open Access Journals (Sweden)

    Derrick E Fouts

    2016-02-01

    Full Text Available Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: (1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; (2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; (3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; (4) finding Leptospira pathogen-specific specialized protein secretion systems; (5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; (6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and (7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. By identifying large scale changes in infectious (pathogenic and intermediately

  2. Scale-Invariant Correlations in Dynamic Bacterial Clusters

    Science.gov (United States)

    Chen, Xiao; Dong, Xu; Be'er, Avraham; Swinney, Harry L.; Zhang, H. P.

    2012-04-01

    In Bacillus subtilis colonies, motile bacteria move collectively, spontaneously forming dynamic clusters. These bacterial clusters share similarities with other systems exhibiting polarized collective motion, such as bird flocks or fish schools. Here we study experimentally how velocity and orientation fluctuations within clusters are spatially correlated. For a range of cell density and cluster size, the correlation length is shown to be 30% of the spatial size of clusters, and the correlation functions collapse onto a master curve after rescaling the separation with correlation length. Our results demonstrate that correlations of velocity and orientation fluctuations are scale invariant in dynamic bacterial clusters.

  3. Mechanical Genomics Identifies Diverse Modulators of Bacterial Cell Stiffness.

    Science.gov (United States)

    Auer, George K; Lee, Timothy K; Rajendram, Manohary; Cesar, Spencer; Miguel, Amanda; Huang, Kerwyn Casey; Weibel, Douglas B

    2016-06-22

    Bacteria must maintain mechanical integrity to withstand the large osmotic pressure differential across the cell membrane and wall. Although maintaining mechanical integrity is critical for proper cellular function, a fact exploited by prominent cell-wall-targeting antibiotics, the proteins that contribute to cellular mechanics remain unidentified. Here, we describe a high-throughput optical method for quantifying cell stiffness and apply this technique to a genome-wide collection of ∼4,000 Escherichia coli mutants. We identify genes with roles in diverse functional processes spanning cell-wall synthesis, energy production, and DNA replication and repair that significantly change cell stiffness when deleted. We observe that proteins with biochemically redundant roles in cell-wall synthesis exhibit different stiffness defects when deleted. Correlating our data with chemical screens reveals that reducing membrane potential generally increases cell stiffness. In total, our work demonstrates that bacterial cell stiffness is a property of both the cell wall and broader cell physiology and lays the groundwork for future systematic studies of mechanoregulation.

  4. Entangled fates of holobiont genomes during invasion: nested bacterial and host diversities in Caulerpa taxifolia

    KAUST Repository

    Arnaud-Haond, S.

    2017-01-30

    Successful prevention and mitigation of biological invasions requires retracing the initial steps of introduction, as well as understanding key elements enhancing the adaptability of invasive species. We studied the genetic diversity of the green alga Caulerpa taxifolia and its associated bacterial communities in several areas around the world. The striking congruence of α and β diversity of the algal genome and endophytic communities reveals a tight association, supporting the holobiont concept as best describing the unit of spreading and invasion. Both genomic compartments support the hypotheses of a unique accidental introduction in the Mediterranean and of multiple invasion events in Southern Australia. In addition to helping with tracing the origin of invasion, bacterial communities exhibit metabolic functions that can potentially enhance adaptability and competitiveness of the consortium they form with their host. We thus hypothesize that low genetic diversities of both host and symbiont communities may contribute to the recent regression in the Mediterranean, in contrast with the persistence of highly diverse assemblages in southern Australia. This study supports the importance of scaling up from the host to the holobiont for a comprehensive understanding of invasions.

  5. Bacterial sigma factors: a historical, structural, and genomic perspective.

    Science.gov (United States)

    Feklístov, Andrey; Sharon, Brian D; Darst, Seth A; Gross, Carol A

    2014-01-01

    Transcription initiation is the crucial focal point of gene expression in prokaryotes. The key players in this process, sigma factors (σs), associate with the catalytic core RNA polymerase to guide it through the essential steps of initiation: promoter recognition and opening, and synthesis of the first few nucleotides of the transcript. Here we recount the key advances in σ biology, from their discovery 45 years ago to the most recent progress in understanding their structure and function at the atomic level. Recent data provide important structural insights into the mechanisms whereby σs initiate promoter opening. We discuss both the housekeeping σs, which govern transcription of the majority of cellular genes, and the alternative σs, which direct RNA polymerase to specialized operons in response to environmental and physiological cues. The review concludes with a genome-scale view of the extracytoplasmic function σs, the most abundant group of alternative σs.

  6. Using multilocus sequence typing to study bacterial variation: prospects in the genomic era.

    Science.gov (United States)

    Jolley, Keith A; Maiden, Martin C J

    2014-01-01

    Multilocus sequence typing (MLST) indexes the sequence variation present in a small number (usually seven) of housekeeping gene fragments located around the bacterial genome. Unique alleles at these loci are assigned arbitrary integer identifiers, which effectively summarizes the variation present in several thousand base pairs of genome sequence information as a series of numbers. Comparing bacterial isolates using allele-based methods efficiently corrects for the effects of lateral gene transfer present in many bacterial populations and is computationally efficient. This 'gene-by-gene' approach can be applied to larger collections of loci, such as the ribosomal protein genes used in ribosomal MLST (rMLST), up to and including the complete set of coding sequences present in a genome, whole-genome MLST (wgMLST), providing scalable, efficient and readily interpreted genome analysis.
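
    The allele-numbering idea described above can be illustrated with a minimal sketch. The locus names, sequences, and function names below are hypothetical and chosen only for illustration; real MLST schemes use curated allele databases. Each distinct sequence at a locus receives an arbitrary integer, and isolates are compared by counting loci whose allele numbers differ, so a recombination event that changes many bases in one locus still counts as a single difference.

```python
from typing import Dict, List, Tuple

def assign_alleles(isolates: Dict[str, Dict[str, str]],
                   loci: List[str]) -> Dict[str, Tuple[int, ...]]:
    """Give each unique sequence at each locus an arbitrary integer ID
    and return one allelic profile (tuple of integers) per isolate."""
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            seq = seqs[locus]
            if seq not in table:
                # Next unused integer for this locus (IDs are arbitrary).
                table[seq] = len(table) + 1
            profile.append(table[seq])
        profiles[name] = tuple(profile)
    return profiles

def allelic_distance(p: Tuple[int, ...], q: Tuple[int, ...]) -> int:
    """Number of loci at which two profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))

# Hypothetical toy data: two loci, three isolates.
loci = ["locus1", "locus2"]
isolates = {
    "A": {"locus1": "ACGT", "locus2": "TTGA"},
    "B": {"locus1": "ACGT", "locus2": "TTGC"},  # one base differs from A
    "C": {"locus1": "ACGA", "locus2": "GGGG"},  # locus2 differs at many bases
}
profiles = assign_alleles(isolates, loci)
print(profiles)
print(allelic_distance(profiles["A"], profiles["B"]))
```

    The same functions work unchanged for larger locus sets, which is the sense in which the gene-by-gene approach scales from seven loci to rMLST or wgMLST.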

  7. Evolution in an oncogenic bacterial species with extreme genome plasticity: Helicobacter pylori East Asian genomes

    Directory of Open Access Journals (Sweden)

    Handa Naofumi

    2011-05-01

    Full Text Available Abstract Background The genome of Helicobacter pylori, an oncogenic bacterium in the human stomach, rapidly evolves and shows wide geographical divergence. The high incidence of stomach cancer in East Asia might be related to bacterial genotype. We used newly developed comparative methods to follow the evolution of East Asian H. pylori genomes using 20 complete genome sequences from Japanese, Korean, Amerind, European, and West African strains. Results A phylogenetic tree of concatenated well-defined core genes supported divergence of the East Asian lineage (hspEAsia; Japanese and Korean) from the European lineage ancestor, and then from the Amerind lineage ancestor. Phylogenetic profiling revealed a large difference in the repertoire of outer membrane proteins (including oipA, hopMN, babABC, sabAB and vacA-2) through gene loss, gain, and mutation. All known functions associated with molybdenum, a rare element essential to nearly all organisms that catalyzes two-electron-transfer oxidation-reduction reactions, appeared to be inactivated. Two pathways linking acetyl~CoA and acetate appeared intact in some Japanese strains. Phylogenetic analysis revealed greater divergence between the East Asian (hspEAsia) and the European (hpEurope) genomes in proteins in host interaction, specifically virulence factors (tipα), outer membrane proteins, and lipopolysaccharide synthesis (human Lewis antigen mimicry) enzymes. Divergence was also seen in proteins in electron transfer and translation fidelity (miaA, tilS), a DNA recombinase/exonuclease that recognizes genome identity (addA), and DNA/RNA hybrid nucleases (rnhAB). Positively selected amino acid changes between hspEAsia and hpEurope were mapped to products of cagA, vacA, homC (outer membrane protein), sotB (sugar transport), and a translation fidelity factor (miaA). Large divergence was seen in genes related to antibiotics: frxA (metronidazole resistance), def (peptide deformylase, drug target), and ftsA (actin

  8. Bacterial Genomics Reveal the Complex Epidemiology of an Emerging Pathogen in Arctic and Boreal Ungulates

    Directory of Open Access Journals (Sweden)

    Taya L. Forde

    2016-11-01

    Full Text Available Northern ecosystems are currently experiencing unprecedented ecological change, largely driven by a rapidly changing climate. Pathogen range expansion, and emergence and altered patterns of infectious disease, are increasingly reported in wildlife at high latitudes. Understanding the causes and consequences of shifting pathogen diversity and host-pathogen interactions in these ecosystems is important for wildlife conservation, and for indigenous populations that depend on wildlife. Among the key questions are whether disease events are associated with endemic or recently introduced pathogens, and whether emerging strains are spreading throughout the region. In this study, we used a phylogenomic approach to address these questions of pathogen endemicity and spread for Erysipelothrix rhusiopathiae, an opportunistic multi-host bacterial pathogen associated with recent mortalities in arctic and boreal ungulate populations in North America. We isolated E. rhusiopathiae from carcasses associated with large-scale die-offs of muskoxen in the Canadian Arctic Archipelago, and from contemporaneous mortality events and/or population declines among muskoxen in northwestern Alaska and caribou and moose in western Canada. Bacterial genomic diversity differed markedly among these locations; minimal divergence was present among isolates from muskoxen in the Canadian Arctic, while in caribou and moose populations, strains from highly divergent clades were isolated from the same location, or even from within a single carcass. These results indicate that mortalities among northern ungulates are not associated with a single emerging strain of E. rhusiopathiae, and that alternate hypotheses need to be explored. Our study illustrates the value and limitations of bacterial genomic data for discriminating between ecological hypotheses of disease emergence, and highlights the importance of studying emerging pathogens within the broader context of environmental and host

  9. DIYA: A Bacterial Annotation Pipeline for any Genomics Lab

    Science.gov (United States)

    2009-02-12

    microbial genomes overnight (Mardis, 2008). These technologies have created many new small ‘genome centers’ (Zwick, 2005). DIYA (Do-It-Yourself...2008) The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation. BMC Bioinformatics, 9, 52. Zwick, M.E

  10. Bayesian prediction of bacterial growth temperature range based on genome sequences

    DEFF Research Database (Denmark)

    Jensen, Dan Børge; Vesth, Tammi Camilla; Hallin, Peter Fischer

    2012-01-01

    on a genomic sequence, would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. Results: This study found a total of 40 protein families useful for distinction between three thermophilicity classes (thermophiles, mesophiles and psychrophiles...... and psychrophilic adapted bacterial genomes....

  11. Ensembl Genomes 2013: scaling up access to genome-wide data

    Science.gov (United States)

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  12. Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Sakanyan Vehary

    2008-05-01

    Full Text Available Abstract Background Bacterial promoters, which increase the efficiency of gene expression, differ from other promoters by several characteristics. This difference, not yet widely exploited in bioinformatics, looks promising for the development of relevant computational tools to search for strong promoters in bacterial genomes. Results We describe a new triad pattern algorithm that predicts strong promoter candidates in annotated bacterial genomes by matching specific patterns for the group I σ70 factors of Escherichia coli RNA polymerase. It detects promoter-specific motifs by consecutively matching three patterns, consisting of an UP-element, required for interaction with the α subunit, and then optimally-separated patterns of -35 and -10 boxes, required for interaction with the σ70 subunit of RNA polymerase. Analysis of 43 bacterial genomes revealed that the frequency of candidate sequences depends on the A+T content of the DNA under examination. The accuracy of in silico prediction was experimentally validated for the genome of a hyperthermophilic bacterium, Thermotoga maritima, by applying a cell-free expression assay using the predicted strong promoters. In this organism, the strong promoters govern genes for translation, energy metabolism, transport, cell movement, and other as-yet unidentified functions. Conclusion The triad pattern algorithm developed for predicting strong bacterial promoters is well suited for analyzing bacterial genomes with an A+T content of less than 62%. This computational tool opens new prospects for investigating global gene expression, and individual strong promoters in bacteria of medical and/or economic significance.
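
    The idea of matching optimally separated -35 and -10 boxes can be sketched as follows. This is not the triad pattern algorithm itself: the UP-element pattern is omitted, and the hexamers used are the textbook E. coli σ70 consensus sequences (TTGACA and TATAAT) with a spacer of 15 to 19 bp, which are assumptions made here rather than the patterns used by the authors. The function name and mismatch threshold are likewise hypothetical.

```python
MINUS35 = "TTGACA"  # textbook sigma70 -35 consensus (assumed pattern)
MINUS10 = "TATAAT"  # textbook sigma70 -10 consensus (assumed pattern)

def mismatches(a: str, b: str) -> int:
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_candidates(seq: str, max_mismatch: int = 1,
                    spacers=range(15, 20)):
    """Return (pos35, spacer, pos10, total_mismatches) for every pair of
    near-consensus -35 and -10 hexamers separated by an allowed spacer."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS35)
        if m35 > max_mismatch:
            continue
        for sp in spacers:
            j = i + 6 + sp              # start of the candidate -10 box
            box = seq[j:j + 6]
            if len(box) < 6:            # ran off the end of the sequence
                break
            m10 = mismatches(box, MINUS10)
            if m10 <= max_mismatch:
                hits.append((i, sp, j, m35 + m10))
    return hits

# Toy sequence: perfect boxes separated by a 17-bp GC spacer.
toy = "GG" + MINUS35 + "GC" * 8 + "G" + MINUS10 + "GG"
print(find_candidates(toy))
```

    Because A+T-rich DNA contains many chance matches to A+T-rich hexamers, a scan of this kind would be expected to return more candidates as A+T content rises, which is in line with the dependence on A+T content reported in the abstract.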

  13. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications

    CERN Document Server

    Benza, Vincenzo G; Dorfman, Kevin D; Scolari, Vittore F; Bromek, Krystyna; Cicuta, Pietro; Lagomarsino, Marco Cosentino

    2012-01-01

    Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organised at various length scales. This has implications on modulating (when not enabling) the core biological processes of replication, transcription, segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. We also discuss some attempts of interpretation that unify different results, highlighting the role that statistical and soft co...

  14. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    In November 2013, there were around 21,000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20,000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... on known functions. This thesis describes the development of new tools for comparative functional annotation and a system for comparative genomics in general. As novel sequenced genomes are becoming more readily available, there is a need for standard analysis tools. The system CMG-biotools is presented...... here as an example of such a system and was used to analyze a set of genomes from the Negativicutes class, a group of bacteria closely related to Gram positives but which has a different cell wall structure and stains Gram negative, as the name indicates. The results of this work show that genomes...

  15. Identification of prophages in bacterial genomes by dinucleotide relative abundance difference.

    Directory of Open Access Journals (Sweden)

    K V Srividhya

    Full Text Available BACKGROUND: Prophages are integrated viral forms in bacterial genomes that have been found to contribute to interstrain genetic variability. Many virulence-associated genes are reported to be prophage encoded. Present computational methods to detect prophages are either by identifying possible essential proteins such as integrases or by an extension of this technique, which involves identifying a region containing proteins similar to those occurring in prophages. These methods suffer due to the problem of low sequence similarity at the protein level, which suggests that a nucleotide based approach could be useful. METHODOLOGY: Earlier, dinucleotide relative abundance (DRA) has been used to identify regions, which deviate from the neighborhood areas, in genomes. We have used the difference in the dinucleotide relative abundance (DRAD) between the bacterial and prophage DNA to aid location of DNA stretches that could be of prophage origin in bacterial genomes. Prophage sequences which deviate from bacterial regions in their dinucleotide frequencies are detected by scanning bacterial genome sequences. The method was validated using a subset of genomes with prophage data from literature reports. A web interface for prophage scan based on this method is available at http://bicmku.in:8082/prophagedb/dra.html. Two hundred bacterial genomes which do not have annotated prophages have been scanned for prophage regions using this method. CONCLUSIONS: The relative dinucleotide distribution difference helps detect prophage regions in genome sequences. The usefulness of this method is seen in the identification of 461 highly probable loci pertaining to prophages which have not been annotated so earlier. This work emphasizes the need to extend the efforts to detect and annotate prophage elements in genome sequences.
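
    A window scan based on dinucleotide relative abundance might look like the sketch below. It assumes the commonly used definition ρ(xy) = f(xy) / (f(x) f(y)) computed on a single strand, and scores each window by the mean absolute difference of ρ across the 16 dinucleotides relative to the whole sequence. The exact DRAD statistic, window size, and thresholds used by the authors are not given in the abstract, so every name and parameter here is an assumption.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]

def dra(seq: str) -> dict:
    """Dinucleotide relative abundance rho(xy) = f(xy) / (f(x) * f(y)),
    single strand only; rho is set to 0 when a base is absent."""
    seq = seq.upper()
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    rho = {}
    for d in DINUCS:
        fx = mono[d[0]] / n
        fy = mono[d[1]] / n
        fxy = di[d] / (n - 1)
        rho[d] = fxy / (fx * fy) if fx and fy else 0.0
    return rho

def dra_difference(a: str, b: str) -> float:
    """Mean absolute difference in rho over the 16 dinucleotides."""
    ra, rb = dra(a), dra(b)
    return sum(abs(ra[d] - rb[d]) for d in DINUCS) / 16

def scan(genome: str, window: int, step: int):
    """Score each window against the whole genome; high scores mark
    regions whose dinucleotide usage deviates from the background."""
    return [(s, dra_difference(genome[s:s + window], genome))
            for s in range(0, len(genome) - window + 1, step)]

# Toy "genome": a compositionally distinct insert inside a uniform host.
host = "ACGT" * 100
insert = "AATT" * 25
genome = host + insert + host
scores = scan(genome, window=100, step=100)
print(max(scores, key=lambda t: t[1]))
```

    In the toy data the highest-scoring window is the one that coincides with the insert, which mirrors how a compositionally atypical prophage region would stand out against the host background.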

  16. Construction and Analysis of Siberian Tiger Bacterial Artificial Chromosome Library with Approximately 6.5-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2014-03-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and a SRY sequence revealed that each of these markers were present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46 coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger.
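
    The relationship between clone number, insert size, and genome coverage can be worked through with a short sketch. It assumes the usual definition of genome equivalents (clones × average insert size / haploid genome size) and the Clarke-Carbon formula P = 1 − (1 − f)^N for the chance of recovering a given sequence, where f is the insert-to-genome size ratio. The function names are hypothetical, and the abstract does not state which formula or genome size the authors used, so this is not a reconstruction of their calculation; in particular, the simple formula below does not reproduce the reported 98.86%.

```python
def genome_equivalents(n_clones: int, insert_kb: float,
                       genome_kb: float) -> float:
    """Library coverage in haploid genome equivalents."""
    return n_clones * insert_kb / genome_kb

def implied_genome_kb(n_clones: int, insert_kb: float,
                      coverage: float) -> float:
    """Genome size implied by a stated coverage (inverse of the above)."""
    return n_clones * insert_kb / coverage

def prob_at_least_one_clone(n_clones: int, insert_kb: float,
                            genome_kb: float) -> float:
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N with f = insert/genome."""
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones

# Figures quoted in the abstract for the Siberian tiger library.
n_clones, insert_kb, coverage = 153_600, 116.5, 6.46

genome_kb = implied_genome_kb(n_clones, insert_kb, coverage)
print(f"implied haploid genome size: {genome_kb / 1e6:.2f} Gb")
print(f"Clarke-Carbon probability:   "
      f"{prob_at_least_one_clone(n_clones, insert_kb, genome_kb):.4f}")
```

    With the quoted figures the implied haploid genome size comes out at roughly 2.77 Gb.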

  17. Construction and Preliminary Characterization Analysis of Wuzhishan Miniature Pig Bacterial Artificial Chromosome Library with Approximately 8-Fold Genome Equivalent Coverage

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2013-01-01

    Full Text Available Bacterial artificial chromosome (BAC) libraries have been invaluable tools for the genome-wide genetic dissection of complex organisms. Here, we report the construction and characterization of a high-redundancy BAC library from a very valuable pig breed in China, Wuzhishan miniature pig (Sus scrofa), using its blood cells and fibroblasts, respectively. The library contains approximately 153,600 clones ordered in 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 152.3 kb, representing approximately 7.68 genome equivalents of the porcine haploid genome and a 99.93% statistical probability of obtaining at least one clone containing a unique DNA sequence in the library. 19 pairs of microsatellite marker primers covering porcine chromosomes were used for screening the BAC library, which showed that each of these markers was positive in the library; the positive clone number was 2 to 9, and the average number was 7.89, which was consistent with 7.68-fold coverage of the porcine genome. And there were no significant differences of genomic BAC library from blood cells and fibroblast cells. Therefore, we identified 19 microsatellite markers that could potentially be used as genetic markers. As a result, this BAC library will serve as a valuable resource for gene identification, physical mapping, and comparative genomics and large-scale genome sequencing in the porcine.

  18. Construction and analysis of Siberian tiger bacterial artificial chromosome library with approximately 6.5-fold genome equivalent coverage.

    Science.gov (United States)

    Liu, Changqing; Bai, Chunyu; Guo, Yu; Liu, Dan; Lu, Taofeng; Li, Xiangchen; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2014-03-07

    Bacterial artificial chromosome (BAC) libraries are extremely valuable for the genome-wide genetic dissection of complex organisms. The Siberian tiger, one of the most well-known wild primitive carnivores in China, is an endangered animal. In order to promote research on its genome, a high-redundancy BAC library of the Siberian tiger was constructed and characterized. The library is divided into two sub-libraries prepared from blood cells and two sub-libraries prepared from fibroblasts. This BAC library contains 153,600 individually archived clones; for PCR-based screening of the library, BACs were placed into 40 superpools of 10 × 384-deep well microplates. The average insert size of BAC clones was estimated to be 116.5 kb, representing approximately 6.46 genome equivalents of the haploid genome and affording a 98.86% statistical probability of obtaining at least one clone containing a unique DNA sequence. Screening the library with 19 microsatellite markers and a SRY sequence revealed that each of these markers were present in the library; the average number of positive clones per marker was 6.74 (range 2 to 12), consistent with 6.46 coverage of the tiger genome. Additionally, we identified 72 microsatellite markers that could potentially be used as genetic markers. This BAC library will serve as a valuable resource for physical mapping, comparative genomic study and large-scale genome sequencing in the tiger.

  19. [Homologous recombination among bacterial genomes: the measurement and identification].

    Science.gov (United States)

    Xianwei, Yang; Ruifu, Yang; Yujun, Cui

    2016-02-01

    Homologous recombination is one of the important sources of bacterial population diversity; it disrupts the clonal relationship among different lineages through horizontal transfer of DNA segments. Because it blurs the signals of vertical inheritance, homologous recombination complicates phylogenetic analysis and the reconstruction of population structure. Here we discuss the impact of homologous recombination on inferring phylogenetic relationships among bacterial isolates, and summarize the tools and models used for recombination measurement and for recombination identification. We also highlight the merits and drawbacks of various approaches, aiming to assist in the practical analysis of homologous recombination in bacterial evolution research.

  20. Construction and characterization of bacterial artificial chromosomes (BACs) containing herpes simplex virus full-length genomes.

    Science.gov (United States)

    Nagel, Claus-Henning; Pohlmann, Anja; Sodeik, Beate

    2014-01-01

    Bacterial artificial chromosomes (BACs) are suitable vectors not only to maintain the large genomes of herpesviruses in Escherichia coli but also to enable the traceless introduction of any mutation using modern tools of bacterial genetics. To clone a herpes simplex virus genome, a BAC replication origin is first introduced into the viral genome by homologous recombination in eukaryotic host cells. As part of their nuclear replication cycle, genomes of herpesviruses circularize and these replication intermediates are then used to transform bacteria. After cloning, the integrity of the recombinant viral genomes is confirmed by restriction length polymorphism analysis and sequencing. The BACs may then be used to design virus mutants. Upon transfection into eukaryotic cells new herpesvirus strains harboring the desired mutations can be recovered and used for experiments in cultured cells as well as in animal infection models.

  1. A highly precise and portable genome engineering method allows comparison of mutational effects across bacterial species.

    Science.gov (United States)

    Nyerges, Ákos; Csörgő, Bálint; Nagy, István; Bálint, Balázs; Bihari, Péter; Lázár, Viktória; Apjok, Gábor; Umenhoffer, Kinga; Bogos, Balázs; Pósfai, György; Pál, Csaba

    2016-03-01

    Currently available tools for multiplex bacterial genome engineering are optimized for a few laboratory model strains, demand extensive prior modification of the host strain, and lead to the accumulation of numerous off-target modifications. Building on prior development of multiplex automated genome engineering (MAGE), our work addresses these problems in a single framework. Using a dominant-negative mutant protein of the methyl-directed mismatch repair (MMR) system, we achieved a transient suppression of DNA repair in Escherichia coli, which is necessary for efficient oligonucleotide integration. By integrating all necessary components into a broad-host vector, we developed a new workflow we term pORTMAGE. It allows efficient modification of multiple loci, without any observable off-target mutagenesis and prior modification of the host genome. Because of the conserved nature of the bacterial MMR system, pORTMAGE simultaneously allows genome editing and mutant library generation in other biotechnologically and clinically relevant bacterial species. Finally, we applied pORTMAGE to study a set of antibiotic resistance-conferring mutations in Salmonella enterica and E. coli. Despite over 100 million years of divergence between the two species, mutational effects remained generally conserved. In sum, a single transformation of a pORTMAGE plasmid allows bacterial species of interest to become an efficient host for genome engineering. These advances pave the way toward biotechnological and therapeutic applications. Finally, pORTMAGE allows systematic comparison of mutational effects and epistasis across a wide range of bacterial species.

  2. A reference pan-genome approach to comparative bacterial genomics: identification of novel epidemiological markers in pathogenic Campylobacter.

    Directory of Open Access Journals (Sweden)

    Guillaume Méric

    Full Text Available The increasing availability of hundreds of whole bacterial genomes provides opportunities for enhanced understanding of the genes and alleles responsible for clinically important phenotypes and how they evolved. However, it is a significant challenge to develop easy-to-use and scalable methods for characterizing these large and complex data and relating it to disease epidemiology. Existing approaches typically focus on either homologous sequence variation in genes that are shared by all isolates, or non-homologous sequence variation--focusing on genes that are differentially present in the population. Here we present a comparative genomics approach that simultaneously approximates core and accessory genome variation in pathogen populations and apply it to pathogenic species in the genus Campylobacter. A total of 7 published Campylobacter jejuni and Campylobacter coli genomes were selected to represent diversity across these species, and a list of all loci that were present at least once was compiled. After filtering duplicates a 7-isolate reference pan-genome, of 3,933 loci, was defined. A core genome of 1,035 genes was ubiquitous in the sample accounting for 59% of the genes in each isolate (average genome size of 1.68 Mb). The accessory genome contained 2,792 genes. A Campylobacter population sample of 192 genomes was screened for the presence of reference pan-genome loci with gene presence defined as a BLAST match of ≥ 70% identity over ≥ 50% of the locus length--aligned using MUSCLE on a gene-by-gene basis. A total of 21 genes were present only in C. coli and 27 only in C. jejuni, providing information about functional differences associated with species and novel epidemiological markers for population genomic analyses. Homologs of these genes were found in several of the genomes used to define the pan-genome and, therefore, would not have been identified using a single reference strain approach.
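
    The presence criterion quoted above (a BLAST match of at least 70% identity over at least 50% of the locus length) and the resulting core/accessory split can be sketched as follows. The hit tuples, locus names, and function names are hypothetical; the sketch assumes hits have already been parsed from BLAST output and does not call BLAST or MUSCLE.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float, int]  # (genome, locus, % identity, aligned length)

def is_present(identity_pct: float, aligned_len: int, locus_len: int,
               min_identity: float = 70.0, min_cov: float = 0.5) -> bool:
    """Apply the thresholds from the abstract to one BLAST hit."""
    return identity_pct >= min_identity and aligned_len / locus_len >= min_cov

def classify(locus_lengths: Dict[str, int], hits: Iterable[Hit],
             genomes: List[str]):
    """Split reference pan-genome loci into core (present in every genome)
    and accessory (missing from at least one genome)."""
    presence = {locus: set() for locus in locus_lengths}
    for genome, locus, identity, aligned in hits:
        if is_present(identity, aligned, locus_lengths[locus]):
            presence[locus].add(genome)
    everyone = set(genomes)
    core = sorted(l for l, g in presence.items() if g == everyone)
    accessory = sorted(l for l, g in presence.items() if g != everyone)
    return core, accessory

# Hypothetical toy data: three loci screened in two genomes.
locus_lengths = {"locusA": 900, "locusB": 600, "locusC": 1200}
genomes = ["g1", "g2"]
hits = [
    ("g1", "locusA", 98.0, 900), ("g2", "locusA", 85.0, 700),
    ("g1", "locusB", 95.0, 600), ("g2", "locusB", 65.0, 600),  # identity too low
    ("g1", "locusC", 99.0, 400),                               # coverage too low
]
core, accessory = classify(locus_lengths, hits, genomes)
print("core:", core)
print("accessory:", accessory)
```

    Loci that pass the thresholds in only one species' genomes would be candidates for the kind of species-specific markers the abstract describes.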

  3. Host imprints on bacterial genomes--rapid, divergent evolution in individual patients.

    Directory of Open Access Journals (Sweden)

    Jaroslaw Zdziarski

    Full Text Available Bacteria lose or gain genetic material and through selection, new variants become fixed in the population. Here we provide the first, genome-wide example of a single bacterial strain's evolution in different deliberately colonized patients and the surprising insight that hosts appear to personalize their microflora. By first obtaining the complete genome sequence of the prototype asymptomatic bacteriuria strain E. coli 83972 and then resequencing its descendants after therapeutic bladder colonization of different patients, we identified 34 mutations, which affected metabolic and virulence-related genes. Further transcriptome and proteome analysis proved that these genome changes altered bacterial gene expression resulting in unique adaptation patterns in each patient. Our results provide evidence that, in addition to stochastic events, adaptive bacterial evolution is driven by individual host environments. Ongoing loss of gene function supports the hypothesis that evolution towards commensalism rather than virulence is favored during asymptomatic bladder colonization.

  4. The Bacterial Origins of the CRISPR Genome-Editing Revolution.

    Science.gov (United States)

    Sontheimer, Erik J; Barrangou, Rodolphe

    2015-07-01

    Like most of the tools that enable modern life science research, the recent genome-editing revolution has its biological roots in the world of bacteria and archaea. Clustered, regularly interspaced, short palindromic repeats (CRISPR) loci are found in the genomes of many bacteria and most archaea, and underlie an adaptive immune system that protects the host cell against invasive nucleic acids such as viral genomes. In recent years, engineered versions of these systems have enabled efficient DNA targeting in living cells from dozens of species (including humans and other eukaryotes), and the exploitation of the resulting endogenous DNA repair pathways has provided a route to fast, easy, and affordable genome editing. In only three years after RNA-guided DNA cleavage was first harnessed, the ability to edit genomes via simple, user-defined RNA sequences has already revolutionized nearly all areas of biological science. CRISPR-based technologies are now poised to similarly revolutionize many facets of clinical medicine, and even promise to advance the long-term goal of directly editing genomic sequences of patients with inherited disease. In this review, we describe the biological and mechanistic basis for these remarkable immune systems, and how their engineered derivatives are revolutionizing basic and clinical research.

  5. Using Genome-scale Models to Predict Biological Capabilities

    DEFF Research Database (Denmark)

    O’Brien, Edward J.; Monk, Jonathan M.; Palsson, Bernhard O.

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellular...... growth capabilities on various substrates and the effect of gene knockouts at the genome scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This Primer will get you started....

  6. Use of genome-scale microbial models for metabolic engineering

    DEFF Research Database (Denmark)

    Patil, Kiran Raosaheb; Åkesson, M.; Nielsen, Jens

    2004-01-01

    Metabolic engineering serves as an integrated approach to design new cell factories by providing rational design procedures and valuable mathematical and experimental tools. Mathematical models have an important role for phenotypic analysis, but can also be used for the design of optimal metabolic network structures. The major challenge for metabolic engineering in the post-genomic era is to broaden its design methodologies to incorporate genome-scale biological data. Genome-scale stoichiometric models of microorganisms represent a first step in this direction.

  7. Large Scale Bacterial Colony Screening of Diversified FRET Biosensors.

    Directory of Open Access Journals (Sweden)

    Julia Litzlbauer

    Full Text Available Biosensors based on Förster Resonance Energy Transfer (FRET) between fluorescent protein mutants have started to revolutionize physiology and biochemistry. However, many types of FRET biosensors show relatively small FRET changes, making measurements with these probes challenging when used under sub-optimal experimental conditions. Thus, a major effort in the field currently lies in designing new optimization strategies for these types of sensors. Here we describe procedures for optimizing FRET changes by large scale screening of mutant biosensor libraries in bacterial colonies. We describe optimization of biosensor expression, permeabilization of bacteria, software tools for analysis, and screening conditions. The procedures reported here may help in improving FRET changes in multiple suitable classes of biosensors.

  8. Genome Sequences of Nine Gram-Negative Vaginal Bacterial Isolates

    Science.gov (United States)

    Deitzler, Grace E.; Ruiz, Maria J.; Lu, Wendy; Weimer, Cory; Park, SoEun; Robinson, Lloyd S.; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka

    2016-01-01

    The vagina is home to a wide variety of bacteria that have great potential to impact human health. Here, we announce reference strains (now available through BEI Resources) and draft genome sequences for 9 Gram-negative vaginal isolates from the taxa Citrobacter, Klebsiella, Fusobacterium, Proteus, and Prevotella. PMID:27688330

  9. A hybrid approach for the automated finishing of bacterial genomes.

    Science.gov (United States)

    Bashir, Ali; Klammer, Aaron A; Robins, William P; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J; Waldor, Matthew K; Schadt, Eric E

    2012-07-01

    Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.

  10. The OME Framework for genome-scale systems biology

    Energy Technology Data Exchange (ETDEWEB)

    Palsson, Bernhard O. [Univ. of California, San Diego, CA (United States)]; Ebrahim, Ali [Univ. of California, San Diego, CA (United States)]; Federowicz, Steve [Univ. of California, San Diego, CA (United States)]

    2014-12-19

    The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high-throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals.
While many software tools exist for handling genome-scale

  11. Ori-Finder: A web-based system for finding oriCs in unannotated bacterial genomes

    Directory of Open Access Journals (Sweden)

    Zhang Chun-Ting

    2008-02-01

    Full Text Available Abstract Background Chromosomal replication is the central event in the bacterial cell cycle. Identification of replication origins (oriCs) is necessary for almost all newly sequenced bacterial genomes. Given the increasing pace of genome sequencing, the currently available software for predicting oriCs, however, still leaves much to be desired. Therefore, the increasing availability of genome sequences calls for improved software to identify oriCs in newly sequenced and unannotated bacterial genomes. Results We have developed Ori-Finder, an online system for finding oriCs in bacterial genomes based on an integrated method comprising the analysis of base composition asymmetry using the Z-curve method, distribution of DnaA boxes, and the occurrence of genes frequently close to oriCs. The program can also deal with unannotated genome sequences by integrating the gene-finding program ZCURVE 1.02. Output of the predicted results is exported to an HTML report, which offers convenient views on the results in both graphical and tabular formats. Conclusion A web-based system to predict replication origins of bacterial genomes has been presented here. Based on this system, oriC regions have been predicted for the bacterial genomes currently available in GenBank. It is hoped that Ori-Finder will become a useful tool for the identification and analysis of oriCs in both bacterial and archaeal genomes.
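
    The base composition asymmetry analysis mentioned in this record can be illustrated with the standard Z-curve transform, which maps a DNA sequence to three cumulative components. The following is a minimal sketch of that transform only, not Ori-Finder's code; the function name `z_curve` is hypothetical, and the DnaA box and gene-location evidence that the abstract describes are not modeled here.

```python
from typing import List, Tuple


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve point (x_n, y_n, z_n) after each base.

    Illustrative helper (hypothetical name); uses the standard definitions:
      x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
      y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
      z_n = (A_n + T_n) - (G_n + C_n)   weak vs. strong hydrogen bonding
    where A_n, C_n, G_n, T_n are base counts in the first n positions.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base in counts:  # ambiguous symbols such as N leave the counts unchanged
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return points


if __name__ == "__main__":
    # Print the final point for a short toy sequence.
    print(z_curve("ACGT")[-1])
```

    In practice the components are plotted along the chromosome, and extrema in the curves are inspected as candidate regions; how Ori-Finder weighs them against the other evidence is not specified in the abstract.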

  12. GFinisher: a new strategy to refine and finish bacterial genome assemblies

    Science.gov (United States)

    Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.

    2016-10-01

    Despite developments in DNA sequencing technology that have improved the number and the length of reads, the process of reconstructing complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented in contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating a fuzzy GC skew graph for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to a dataset of 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.

  13. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, I.; Uttenthal, Åse;

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stabl...... single-copy bacterial artificial chromosome (BAC) generating full-length pestivirus DNAs from which infectious RNA transcripts could also be derived. Our strategy allows construction of stable infectious BAC DNAs from a single full-length PCR product....

  14. Complete Genome Sequence of a Human Cytomegalovirus Strain AD169 Bacterial Artificial Chromosome Clone

    Science.gov (United States)

    Ostermann, Eleonore; Spohn, Michael; Indenbirken, Daniela

    2016-01-01

    The complete sequence of the human cytomegalovirus strain AD169 (variant ATCC) cloned as a bacterial artificial chromosome (AD169-BAC, also known as HB15 or pHB15) was determined. The viral genome has a length of 230,290 bp and shows 52 nucleotide differences compared to a previously sequenced AD169varATCC clone. PMID:27034483

  15. Complete genome sequence of Japanese Erwinia strain Ejp617, a bacterial shoot blight pathogen of pear.

    Science.gov (United States)

    Park, Duck Hwan; Thapa, Shree Prasad; Choi, Beom-Soon; Kim, Won-Sik; Hur, Jang Hyun; Cho, Jun Mo; Lim, Jong-Sung; Choi, Ik-Young; Lim, Chun Keun

    2011-01-01

    The Japanese Erwinia strain Ejp617 is a plant pathogen that causes bacterial shoot blight of pear in Japan. Here, we report the complete genome sequence of strain Ejp617 isolated from Nashi pears in Japan to provide further valuable insight among related Erwinia species.

  16. The early stage of bacterial genome-reductive evolution in the host.

    Directory of Open Access Journals (Sweden)

    Han Song

    2010-05-01

    Full Text Available The equine-associated obligate pathogen Burkholderia mallei was developed by reductive evolution involving a substantial portion of the genome from Burkholderia pseudomallei, a free-living opportunistic pathogen. With its short history of divergence (approximately 3.5 myr), B. mallei provides an excellent resource to study the early steps in bacterial genome reductive evolution in the host. By examining 20 genomes of B. mallei and B. pseudomallei, we found that stepwise massive expansion of IS (insertion sequence) elements ISBma1, ISBma2, and IS407A occurred during the evolution of B. mallei. Each element proliferated through the sites where its target selection preference was met. Then, ISBma1 and ISBma2 contributed to the further spread of IS407A by providing secondary insertion sites. This spread increased genomic deletions and rearrangements, which were predominantly mediated by IS407A. There were also nucleotide-level disruptions in a large number of genes. However, no significant signs of erosion were yet noted in these genes. Intriguingly, all these genomic modifications did not seriously alter the gene expression patterns inherited from B. pseudomallei. This efficient and elaborate genomic transition was enabled largely through the formation of the highly flexible IS-blended genome and the guidance by selective forces in the host. The detailed IS intervention, unveiled for the first time in this study, may represent the key component of a general mechanism for early bacterial evolution in the host.

  17. Construction of a llama bacterial artificial chromosome library with approximately 9-fold genome equivalent coverage.

    Science.gov (United States)

    Airmet, K W; Hinckley, J D; Tree, L T; Moss, M; Blumell, S; Ulicny, K; Gustafson, A K; Weed, M; Theodosis, R; Lehnardt, M; Genho, J; Stevens, M R; Kooyman, D L

    2012-01-01

    The llama is an important agricultural livestock in much of South America. The llama is increasing in popularity in the United States as a companion animal. Little work has been done to improve llama production using modern technology. A paucity of information is available regarding the llama genome. We report the construction of a llama bacterial artificial chromosome (BAC) library of about 196,224 clones in the vector pECBAC1. Using flow cytometry and bovine, human, mouse, and chicken as controls, we determined the llama genome size to be 2.4 × 10⁹ bp. The average insert size of the library is 137.8 kb, corresponding to approximately 9-fold genome coverage. Further studies are needed to characterize the library and llama genome. We anticipate that this new library will help facilitate future genomic studies in the llama.

  18. An evaluation of multiple annealing and looping based genome amplification using a synthetic bacterial community

    Institute of Scientific and Technical Information of China (English)

    WANG Yong; GAO Zhaoming; XU Ying; LI Guangyu; HE Lisheng; QIAN Peiyuan

    2016-01-01

    The low biomass in environmental samples is a major challenge for microbial metagenomic studies. Amplification of genomic DNA is frequently applied to meet the minimum DNA requirement of high-throughput next-generation sequencing technology. Using a synthetic bacterial community, the amplification efficiency of the Multiple Annealing and Looping Based Amplification Cycles (MALBAC) kit, which was originally developed to amplify the single-cell genomic DNA of mammalian organisms, is examined. A DNA template of 10 pg in each MALBAC amplification reaction may generate enough DNA for Illumina sequencing. Using 10 pg and 100 pg templates for each reaction set, the MALBAC kit shows a stable and homogeneous amplification, as indicated by the highly consistent coverage of the reads from the two amplified samples on the contigs assembled from the original unamplified sample. Although the GenomePlex whole genome amplification kit allows one to generate enough DNA using 100 pg of template in each reaction, the minority species in the mixed bacterial community are not linearly amplified. For both kits, the GC-rich regions of the genomic DNA are not efficiently amplified, as suggested by the low coverage of the contigs with high GC content. The results support the high efficiency of the MALBAC kit for the amplification of environmental microbial DNA samples, while raising concerns about its application to bacterial species with high GC content.

  19. Associations between inverted repeats and the structural evolution of bacterial genomes.

    Science.gov (United States)

    Achaz, Guillaume; Coissac, Eric; Netter, Pierre; Rocha, Eduardo P C

    2003-08-01

    The stability of the structure of bacterial genomes is challenged by recombination events. Since major rearrangements (i.e., inversions) are thought to frequently operate by homologous recombination between inverted repeats, we analyzed the presence and distribution of such repeats in bacterial genomes and their relation to the conservation of chromosomal structure. First, we show that there is a strong under-representation of inverted repeats, relative to direct repeats, in most chromosomes, especially among the ones regarded as most stable. Second, we show that the avoidance of repeats is frequently associated with the stability of the genomes. Closely related genomes reported to differ in terms of stability are also found to differ in the number of inverted repeats. Third, when using replication strand bias as a proxy for genome stability, we find a significant negative correlation between this strand bias and the abundance of inverted repeats. Fourth, when measuring the recombining potential of inverted repeats and their eventual impact on different features of the chromosomal structure, we observe a tendency of repeats to be located in the chromosome in such a way that rearrangements produce a smaller strand switch and smaller asymmetries than expected by chance. Finally, we discuss the limitations of our analysis and the influence of factors such as the nature of repeats, e.g., transposases, or the differences in the recombination machinery among bacteria. These results shed light on the challenges imposed on the genome structure by the presence of inverted repeats.

  20. Genomic Analyses of Bacterial Porin-Cytochrome Gene Clusters

    Directory of Open Access Journals (Sweden)

    Liang Shi

    2014-11-01

    Full Text Available The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c-type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in the microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with the substrates other than Fe(III) and Mn(IV) oxides.

  1. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    KAUST Repository

    Neave, Matthew J.

    2017-01-17

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts.

  2. Comprehensive prediction of chromosome dimer resolution sites in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Arakawa Kazuharu

    2011-01-01

    Full Text Available Abstract Background During the replication process of bacteria with circular chromosomes, an odd number of homologous recombination events results in concatenated dimer chromosomes that cannot be partitioned into daughter cells. However, many bacteria harbor a conserved dimer resolution machinery consisting of one or two tyrosine recombinases, XerC and XerD, and their 28-bp target site, dif. Results To study the evolution of the dif/XerCD system and its relationship with replication termination, we report the comprehensive prediction of dif sequences in silico using a phylogenetic prediction approach based on iterated hidden Markov modeling. Using this method, dif sites were identified in 641 organisms among 16 phyla, with a 97.64% identification rate for single-chromosome strains. The dif sequence positions were shown to be strongly correlated with the GC skew shift-point that is induced by replicational mutation/selection pressures, but the difference in the positions of the predicted dif sites and the GC skew shift-points did not correlate with the degree of replicational mutation/selection pressures. Conclusions The sequence of dif sites is widely conserved among many bacterial phyla, and they can be computationally identified using our method. The lack of correlation between dif position and the degree of GC skew suggests that replication termination does not occur strictly at dif sites.
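
    The "GC skew shift-point" referred to in this record can be illustrated with a cumulative GC skew scan. The sketch below is not the authors' method (which the abstract describes as iterated hidden Markov modeling of dif sequences); it only shows, under simple assumptions, how shift-points are commonly located. The function name, the window size, and the use of non-overlapping windows are all illustrative choices.

```python
from typing import Tuple


def gc_skew_shift_points(seq: str, window: int = 1000) -> Tuple[int, int]:
    """Return (position of cumulative-skew minimum, position of maximum).

    Per-window skew is (G - C) / (G + C) over non-overlapping windows.
    Under the usual convention the minimum is read as lying near the
    replication origin and the maximum near the terminus; this mapping is
    an assumption of the sketch and should be checked per genome.
    """
    seq = seq.upper()
    cumulative = 0.0
    positions, values = [], []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        cumulative += (g - c) / (g + c) if (g + c) else 0.0
        positions.append(start + window)  # window end coordinate
        values.append(cumulative)
    if not values:
        raise ValueError("sequence is shorter than one window")
    i_min = min(range(len(values)), key=values.__getitem__)
    i_max = max(range(len(values)), key=values.__getitem__)
    return positions[i_min], positions[i_max]


if __name__ == "__main__":
    # Toy sequence: a C-rich half followed by a G-rich half.
    toy = "C" * 4000 + "G" * 4000
    print(gc_skew_shift_points(toy))
```

    A predicted dif coordinate could then be compared with the terminus-side shift-point, which is the kind of positional comparison the abstract reports.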

  3. Genome-scale engineering for systems and synthetic biology.

    Science.gov (United States)

    Esvelt, Kevin M; Wang, Harris H

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chasses, novel biochemistries, and safer organismal and ecological engineering.

  4. CRISPR-Cas: From the Bacterial Adaptive Immune System to a Versatile Tool for Genome Engineering.

    Science.gov (United States)

    Kirchner, Marion; Schneider, Sabine

    2015-11-01

    The field of biology has been revolutionized by the recent advancement of an adaptive bacterial immune system as a universal genome engineering tool. Bacteria and archaea use repetitive genomic elements termed clustered regularly interspaced short palindromic repeats (CRISPR) in combination with an RNA-guided nuclease (CRISPR-associated nuclease: Cas) to target and destroy invading DNA. By choosing the appropriate sequence of the guide RNA, this two-component system can be used to efficiently modify, target, and edit genomic loci of interest in plants, insects, fungi, mammalian cells, and whole organisms. This has opened up new frontiers in genome engineering, including the potential to treat or cure human genetic disorders. Now the potential risks as well as the ethical, social, and legal implications of this powerful new technique move into the limelight.

  5. The diversity of a distributed genome in bacterial populations

    CERN Document Server

    Baumdicker, F; Pfaffelhuber, P

    2009-01-01

    The distributed genome hypothesis states that the set of genes in a population of bacteria is distributed over all individuals that belong to the specific taxon. It implies that certain genes can be gained and lost from generation to generation. We use the random genealogy given by a Kingman coalescent in order to superimpose events of gene gain and loss along ancestral lines. Gene gains occur at constant rate along ancestral lines. We assume that gained genes have never been present in the population before. Gene losses occur at a rate proportional to the number of genes present along the ancestral line. In this "infinitely many genes model" we derive moments for several statistics within a sample: the average number of genes per individual, the average number of genes differing between individuals, the number of incongruent pairs of genes, the total number of different genes in the sample and the gene frequency spectrum. We demonstrate that the model gives a reasonable fit with gene frequency data from mari...
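
    The gain and loss dynamics described in this record can be illustrated on a single ancestral line. The sketch below is a simplification and not the paper's derivation: it ignores the Kingman coalescent entirely and simulates only one lineage in which genes are gained at a constant rate and each present gene is lost independently at a fixed per-gene rate. The parameter names `gain_rate` and `loss_rate` are hypothetical. For such a process the long-run mean number of genes is `gain_rate / loss_rate`, which the simulation approximates.

```python
import random


def time_averaged_gene_count(gain_rate: float, loss_rate: float,
                             total_time: float, seed: int = 0) -> float:
    """Simulate gene gain/loss along one lineage; return the time-averaged gene count.

    Gains arrive at constant rate `gain_rate`; with k genes present, the total
    loss rate is `loss_rate * k` (loss proportional to the number of genes).
    """
    rng = random.Random(seed)
    t, genes, area = 0.0, 0, 0.0
    while True:
        total_rate = gain_rate + loss_rate * genes
        dt = rng.expovariate(total_rate)  # waiting time to the next event
        if t + dt >= total_time:
            area += genes * (total_time - t)  # account for the final partial interval
            break
        area += genes * dt
        t += dt
        # Choose the event type in proportion to its rate.
        if rng.random() < gain_rate / total_rate:
            genes += 1
        else:
            genes -= 1
    return area / total_time


if __name__ == "__main__":
    estimate = time_averaged_gene_count(gain_rate=10.0, loss_rate=1.0, total_time=5000.0)
    print(f"simulated mean: {estimate:.2f}  (gain_rate / loss_rate = 10.00)")
```

    Statistics that depend on shared ancestry, such as the number of genes differing between individuals or the gene frequency spectrum, require the genealogy and are outside the scope of this single-lineage sketch.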

  6. Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss

    Directory of Open Access Journals (Sweden)

    Barker Melissa

    2010-12-01

    Full Text Available Abstract Background The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus

  7. Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

    Energy Technology Data Exchange (ETDEWEB)

    Muchero, Wellington [ORNL; Labbe, Jessy L [ORNL; Priya, Ranjan [University of Tennessee, Knoxville (UTK); DiFazio, Steven P [West Virginia University, Morgantown; Tuskan, Gerald A [ORNL

    2014-01-01

    To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing of individual genotypes, which in turn facilitates large-scale SNP discovery and identification of large-scale polymorphisms, are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies for Populus genomic and genetic studies of complex and specialized traits.

  8. Physical descriptions of the bacterial nucleoid at large scales, and their biological implications

    Science.gov (United States)

    Benza, Vincenzo G.; Bassetti, Bruno; Dorfman, Kevin D.; Scolari, Vittore F.; Bromek, Krystyna; Cicuta, Pietro; Cosentino Lagomarsino, Marco

    2012-07-01

    Recent experimental and theoretical approaches have attempted to quantify the physical organization (compaction and geometry) of the bacterial chromosome with its complement of proteins (the nucleoid). The genomic DNA exists in a complex and dynamic protein-rich state, which is highly organized at various length scales. This has implications for modulating (when not directly enabling) the core biological processes of replication, transcription and segregation. We overview the progress in this area, driven in the last few years by new scientific ideas and new interdisciplinary experimental techniques, ranging from high space- and time-resolution microscopy to high-throughput genomics employing sequencing to map different aspects of the nucleoid-related interactome. The aim of this review is to present the wide spectrum of experimental and theoretical findings coherently, from a physics viewpoint. In particular, we highlight the role that statistical and soft condensed matter physics play in describing this system of fundamental biological importance, specifically reviewing classic and more modern tools from the theory of polymers. We also discuss some attempts toward unifying interpretations of the current results, pointing to possible directions for future investigation.

  9. Genome scale transcriptomics of baculovirus-insect interactions.

    Science.gov (United States)

    Nguyen, Quan; Nielsen, Lars K; Reid, Steven

    2013-11-12

    Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  10. Genome Scale Transcriptomics of Baculovirus-Insect Interactions

    Directory of Open Access Journals (Sweden)

    Steven Reid

    2013-11-01

    Full Text Available Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.

  11. Large-scale production of magnetic nanoparticles using bacterial fermentation.

    Science.gov (United States)

    Moon, Ji-Won; Rawn, Claudia J; Rondinone, Adam J; Love, Lonnie J; Roh, Yul; Everett, S Michelle; Lauf, Robert J; Phelps, Tommy J

    2010-10-01

    Production of both nano-sized particles of crystalline pure phase magnetite and magnetite substituted with Co, Ni, Cr, Mn, Zn or the rare earths for some of the Fe has been demonstrated using microbial processes. This microbial production of magnetic nanoparticles can be achieved in large quantities and at low cost. In these experiments, over 1 kg (wet weight) of Zn-substituted magnetite (nominal composition of Zn(0.6)Fe(2.4)O4) was recovered from 30 l fermentations. Transmission electron microscopy (TEM) was used to confirm that the extracellular magnetites exhibited good mono-dispersity. TEM results also showed a highly reproducible particle size and corroborated average crystallite size (ACS) of 13.1 ± 0.8 nm determined through X-ray diffraction (N = 7) at a 99% confidence level. Based on scale-up experiments performed using a 35-l reactor, the increase in ACS reproducibility may be attributed to a combination of factors including an increase of electron donor input, availability of divalent substitution metal ions and fewer ferrous ions in the case of substituted magnetite, and increased reactor volume overcoming differences in each batch. Commercial nanometer sized magnetite (25-50 nm) may cost $500/kg. However, microbial processes are potentially capable of producing 5-90 nm pure or substituted magnetites at a fraction of the cost of traditional chemical synthesis. While there are numerous approaches for the synthesis of nanoparticles, bacterial fermentation of magnetite or metal-substituted magnetite may represent an advantageous manufacturing technology with respect to yield, reproducibility and scalable synthesis with low costs at low energy input.

  12. Large-scale data mining pilot project in human genome

    Energy Technology Data Exchange (ETDEWEB)

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of "large-scale" will be clarified Section. In the short term, this effort will focus on several mission-critical questions of Genome project. We will adapt current data mining techniques to the Genome domain, to quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be fully-staffed data warehousing effort in the human Genome area. The long term goal is strong applications-oriented research program in large-scale data mining. The tools, skill set gained will be directly applicable to a wide spectrum of tasks involving a for large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  13. bcgTree: automatized phylogenetic tree building from bacterial core genomes.

    Science.gov (United States)

    Ankenbrand, Markus J; Keller, Alexander

    2016-10-01

    The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis. Here, we describe the workflow of bcgTree and, as a proof-of-concept, its usefulness in resolving the phylogeny of 293 publicly available bacterial strains of the genus Lactobacillus. We also evaluate its performance in both low- and high-level taxonomy test sets. The tool is freely available at GitHub ( https://github.com/iimog/bcgTree ) and our institutional homepage ( http://www.dna-analytics.biozentrum.uni-wuerzburg.de ).

  14. Testing the infinitely many genes model for the evolution of the bacterial core genome and pangenome.

    Science.gov (United States)

    Collins, R Eric; Higgs, Paul G

    2012-11-01

    When groups of related bacterial genomes are compared, the number of core genes found in all genomes is usually much less than the mean genome size, whereas the size of the pangenome (the set of genes found on at least one of the genomes) is much larger than the mean size of one genome. We analyze 172 complete genomes of Bacilli and compare the properties of the pangenomes and core genomes of monophyletic subsets taken from this group. We then assess the capabilities of several evolutionary models to predict these properties. The infinitely many genes (IMG) model is based on the assumption that each new gene can arise only once. The predictions of the model depend on the shape of the evolutionary tree that underlies the divergence of the genomes. We calculate results for coalescent trees, star trees, and arbitrary phylogenetic trees of predefined fixed branch length. On a star tree, the pangenome size increases linearly with the number of genomes, as has been suggested in some previous studies, whereas on a coalescent tree, it increases logarithmically. The coalescent tree gives a better fit to the data, for all the examples we consider. In some cases, a fixed phylogenetic tree proved better than the coalescent tree at reproducing structure in the gene frequency spectrum, but little improvement was gained in predictions of the core and pangenome sizes. Most of the data are well explained by a model with three classes of gene: an essential class that is found in all genomes, a slow class whose rate of origination and deletion is slow compared with the time of divergence of the genomes, and a fast class showing rapid origination and deletion. Although the majority of genes originating in a genome are in the fast class, these genes are not retained for long periods, and the majority of genes present in a genome are in the slow or essential classes. 
    In general, we show that the IMG model is useful for comparison with experimental genome data both for species level and
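
    The definitions used in this abstract (core genes present in every genome, a pangenome of genes present in at least one genome, and a gene frequency spectrum) can be illustrated with a minimal sketch. The strain names and gene names below are hypothetical toy data, and the function names are illustrative only; this is not the authors' code or their IMG model.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def core_and_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pangenome) for a mapping of genome name -> set of gene names."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)  # genes found in all genomes
    pan = set.union(*gene_sets)          # genes found in at least one genome
    return core, pan


def gene_frequency_spectrum(genomes: Dict[str, Set[str]]) -> Counter:
    """Map k -> number of genes that occur in exactly k genomes."""
    occurrences = Counter(gene for genes in genomes.values() for gene in genes)
    return Counter(occurrences.values())


# Hypothetical toy data: three strains with invented gene names.
toy = {
    "strain_A": {"dnaA", "gyrB", "recA", "phage1"},
    "strain_B": {"dnaA", "gyrB", "recA", "tnpX"},
    "strain_C": {"dnaA", "gyrB", "abcZ"},
}

core, pan = core_and_pangenome(toy)
spectrum = gene_frequency_spectrum(toy)
mean_genome_size = sum(len(genes) for genes in toy.values()) / len(toy)
```

    In this toy set the core (2 genes) is smaller than the mean genome size (about 3.7 genes), which in turn is smaller than the pangenome (6 genes), the same ordering the abstract describes for real genome groups.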

  15. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
    A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  16. Phylogeny of bacterial and archaeal genomes using conserved genes: supertrees and supermatrices.

    Directory of Open Access Journals (Sweden)

    Jenna Morgan Lang

    Full Text Available Over 3000 microbial (bacterial and archaeal) genomes have been made publicly available to date, providing an unprecedented opportunity to examine evolutionary genomic trends and offering valuable reference data for a variety of other studies such as metagenomics. The utility of these genome sequences is greatly enhanced when we have an understanding of how they are phylogenetically related to each other. Therefore, we here describe our efforts to reconstruct the phylogeny of all available bacterial and archaeal genomes. We identified 24 single-copy, ubiquitous genes suitable for this phylogenetic analysis. We used two approaches to combine the data for the 24 genes. First, we concatenated alignments of all genes into a single alignment from which a Maximum Likelihood (ML) tree was inferred using RAxML. Second, we used a relatively new approach to combining gene data, Bayesian Concordance Analysis (BCA), as implemented in the BUCKy software, in which the results of 24 single-gene phylogenetic analyses are used to generate a "primary concordance" tree. A comparison of the concatenated ML tree and the primary concordance (BUCKy) tree reveals that the two approaches give similar results, relative to a phylogenetic tree inferred from the 16S rRNA gene. After comparing the results and the methods used, we conclude that the current best approach for generating a single phylogenetic tree, suitable for use as a reference phylogeny for comparative analyses, is to perform a maximum likelihood analysis of a concatenated alignment of conserved, single-copy genes.
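
    The concatenation ("supermatrix") step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: each per-gene alignment is assumed to be a dictionary of taxon name to aligned sequence, taxa missing from a gene are padded with gap characters, and the function name, taxon labels and sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def concatenate_alignments(
    gene_alignments: List[Dict[str, str]],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    Returns the concatenated alignment and a list of
    (gene label, first column, last column) partitions, 1-based inclusive.
    """
    taxa = sorted(set().union(*(aln.keys() for aln in gene_alignments)))
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for index, aln in enumerate(gene_alignments, start=1):
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {index}: sequences differ in aligned length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene contributes an all-gap block.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((f"gene{index}", start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(blocks) for taxon, blocks in pieces.items()}
    return supermatrix, partitions


# Hypothetical toy alignments for two genes; taxon C lacks the second gene.
gene1 = {"A": "ATG-", "B": "ATGC", "C": "ATGA"}
gene2 = {"A": "GG", "B": "GC"}
supermatrix, partitions = concatenate_alignments([gene1, gene2])
```

    The partition boundaries are kept because partitioned maximum-likelihood analyses need to know which columns belong to which gene; the exact partition file syntax expected by a given tree-building program is not shown here.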

  17. From Environment to Man: Genome Evolution and Adaptation of Human Opportunistic Bacterial Pathogens

    Science.gov (United States)

    Aujoulat, Fabien; Roger, Frédéric; Bourdier, Alice; Lotthé, Anne; Lamy, Brigitte; Marchandin, Hélène; Jumas-Bilak, Estelle

    2012-01-01

    Environment is recognized as a huge reservoir for bacterial species and a source of human pathogens. Some environmental bacteria have an extraordinary range of activities that include promotion of plant growth or disease, breakdown of pollutants, production of original biomolecules, but also multidrug resistance and human pathogenicity. The versatility of bacterial life-style involves adaptation to various niches. Adaptation to both open environment and human specific niches is a major challenge that involves intermediate organisms allowing pre-adaptation to humans. The aim of this review is to analyze genomic features of environmental bacteria in order to explain their adaptation to human beings. The genera Pseudomonas, Aeromonas and Ochrobactrum provide valuable examples of opportunistic behavior associated with particular genomic structure and evolution. Particularly, we performed original genomic comparisons among aeromonads and between the strictly intracellular pathogens Brucella spp. and the mild opportunistic pathogens Ochrobactrum spp. We conclude that the adaptation to humans could coincide with a speciation in action revealed by modifications in both genomic and population structures. This adaptation-driven speciation could be a major mechanism for the emergence of true pathogens besides the acquisition of specialized virulence factors. PMID:24704914

  18. Genome-scale metabolic representation of Amycolatopsis balhimycina

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Figueiredo, L. F.; Förster, Jochen

    2012-01-01

    EC numbers, 647 metabolites and 1,363 metabolic reactions. During the analysis of the metabolic model, linear, quadratic and evolutionary programming algorithms using flux balance analysis (FBA), minimization of metabolic adjustment (MOMA), and OptGene, respectively were applied as well as phenotypic...... biosynthesis in Amycolatopsis balhimycina. The balhimycin yield obtained by A. balhimycina is, however, low and there is therefore a need to improve balhimycin production. In this study, we performed genome sequencing, assembly and annotation analysis of A. balhimycina and further used these annotated data...... to reconstruct a genome‐scale metabolic model for the organism. Here we generated an almost complete A. balhimycina genome sequence comprising 10,562,587 base pairs assembled into 2,153 contigs. The high GC‐genome (∼69%) includes 8,585 open reading frames (ORFs). We used our integrative toolbox called SEQTOR...

  19. Replicon-dependent bacterial genome evolution: the case of Sinorhizobium meliloti.

    Science.gov (United States)

    Galardini, Marco; Pini, Francesco; Bazzicalupo, Marco; Biondi, Emanuele G; Mengoni, Alessio

    2013-01-01

    Many bacterial species, such as the alphaproteobacterium Sinorhizobium meliloti, are characterized by open pangenomes and contain multipartite genomes consisting of a chromosome and other large-sized replicons, such as chromids, megaplasmids, and plasmids. The evolutionary forces in both functional and structural aspects that shape the pangenome of species with multipartite genomes are still poorly understood. Therefore, we sequenced the genomes of 10 new S. meliloti strains, analyzed with four publicly available additional genomic sequences. Results indicated that the three main replicons present in these strains (a chromosome, a chromid, and a megaplasmid) partly show replicon-specific behaviors related to strain differentiation. In particular, the pSymB chromid was shown to be a hot spot for positively selected genes, and, unexpectedly, genes resident in the pSymB chromid were also found to be more widespread in distant taxa than those located in the other replicons. Moreover, through the exploitation of a DNA proximity network, a series of conserved "DNA backbones" were found to shape the evolution of the genome structure, with the rest of the genome experiencing rearrangements. The presented data allow depicting a scenario where the pSymB chromid has a distinctive role in intraspecies differentiation and in evolution through positive selection, whereas the pSymA megaplasmid mostly contributes to structural fluidity and to the emergence of new functions, indicating a specific evolutionary role for each replicon in the pangenome evolution.

  20. Applying Shannon's information theory to bacterial and phage genomes and metagenomes.

    Science.gov (United States)

    Akhter, Sajia; Bailey, Barbara A; Salamon, Peter; Aziz, Ramy K; Edwards, Robert A

    2013-01-01

    All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available database, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis.
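
    The dependence of information content on sequence word size can be illustrated with a minimal sketch that computes Shannon entropy (in bits) over overlapping words of length k. The exact index definition used by the authors is not given in the abstract, so the formula below is the standard Shannon entropy applied to word frequencies, and the function name and test sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_index(sequence: str, word_size: int = 3) -> float:
    """Shannon entropy, in bits, of the overlapping words of length word_size."""
    seq = sequence.upper()
    words = [seq[i:i + word_size] for i in range(len(seq) - word_size + 1)]
    if not words:
        return 0.0  # sequence shorter than the word size
    counts = Counter(words)
    total = len(words)
    # H = sum over words of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical toy sequences.
homopolymer = shannon_index("AAAAAAAA", word_size=1)   # a single repeated word
balanced = shannon_index("ACGT" * 4, word_size=1)      # four equally frequent words
```

    A homopolymer carries no uncertainty, while a sequence with four equally frequent single-letter words reaches the 2-bit maximum for word size 1; larger word sizes raise the attainable maximum, which is one reason the measured value depends on the word size chosen.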

  1. Applying Shannon's information theory to bacterial and phage genomes and metagenomes

    Science.gov (United States)

    Akhter, Sajia; Bailey, Barbara A.; Salamon, Peter; Aziz, Ramy K.; Edwards, Robert A.

    2013-01-01

    All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available database, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis. PMID:23301154

  3. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    Directory of Open Access Journals (Sweden)

    Herring Christopher D

    2007-08-01

    Full Text Available Background: With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical, though, to know the accuracy of the technique used and to establish confidence that all of the mutations were detected. Results: In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion: CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.
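
    The detection figures reported above can be turned into rates with simple arithmetic. The sketch below is illustrative only: the counts and the one-per-244-Kb false positive rate come from the abstract, while the genome length of roughly 4.64 Mb for E. coli MG1655 is an assumption not stated there, and the function and variable names are hypothetical.

```python
def detection_rate(detected: int, present: int) -> float:
    """Fraction of known differences that were detected."""
    return detected / present


# Counts taken from the abstract.
small_difference_rate = detection_rate(7, 8)    # small sequence differences
is_insertion_rate = detection_rate(9, 12)       # IS element insertions

# One false positive SNP per 244 Kb of genome sequence (from the abstract).
FALSE_POSITIVE_SPACING_BP = 244_000

# Assumption: approximate MG1655 genome length, not given in the abstract.
ASSUMED_GENOME_LENGTH_BP = 4_640_000

expected_false_positive_snps = ASSUMED_GENOME_LENGTH_BP / FALSE_POSITIVE_SPACING_BP
```

    Under that assumed genome length, the reported rate corresponds to roughly 19 false positive SNP calls across a whole genome, which gives a sense of how much follow-up verification a resequencing experiment of this kind might need.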

  4. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    Directory of Open Access Journals (Sweden)

    Reuben B Vercoe

    2013-04-01

    Full Text Available In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  5. Genome Sequences of 15 Gardnerella vaginalis Strains Isolated from the Vaginas of Women with and without Bacterial Vaginosis

    Science.gov (United States)

    Robinson, Lloyd S.; Perry, Justin; Lek, Sai; Wollam, Aye; Sodergren, Erica; Weinstock, George

    2016-01-01

    Gardnerella vaginalis is a predominant species in bacterial vaginosis, a dysbiosis of the vagina that is associated with adverse health outcomes, including preterm birth. Here, we present the draft genome sequences of 15 Gardnerella vaginalis strains (now available through BEI Resources) isolated from women with and without bacterial vaginosis. PMID:27688326

  6. Power Laws, Scale-Free Networks and Genome Biology

    CERN Document Server

    Koonin, Eugene V; Karev, Georgy P

    2006-01-01

    Power Laws, Scale-free Networks and Genome Biology deals with crucial aspects of the theoretical foundations of systems biology, namely power law distributions and scale-free networks, which have emerged as the hallmarks of biological organization in the post-genomic era. The chapters in the book not only describe the interesting mathematical properties of biological networks but also move beyond phenomenology, toward models of evolution capable of explaining the emergence of these features. The collection of chapters, contributed by both physicists and biologists, strives to address the problems in this field in a rigorous but not excessively mathematical manner and to represent different viewpoints, which is crucial in this emerging discipline. Each chapter includes, in addition to technical descriptions of properties of biological networks and evolutionary models, a more general and accessible introduction to the respective problems. Most chapters emphasize the potential of theoretical systems biology for disco...

  7. Metabolic Complementarity and Genomics of the Dual Bacterial Symbiosis of Sharpshooters

    Science.gov (United States)

    Wu, Dongying; Daugherty, Sean C; Van Aken, Susan E; Pai, Grace H; Watkins, Kisha L; Khouri, Hoda; Tallon, Luke J; Zaborsky, Jennifer M; Dunbar, Helen E; Tran, Phat L; Moran, Nancy A

    2006-01-01

    Mutualistic intracellular symbiosis between bacteria and insects is a widespread phenomenon that has contributed to the global success of insects. The symbionts, by provisioning nutrients lacking from diets, allow various insects to occupy or dominate ecological niches that might otherwise be unavailable. One such insect is the glassy-winged sharpshooter (Homalodisca coagulata), which feeds on xylem fluid, a diet exceptionally poor in organic nutrients. Phylogenetic studies based on rRNA have shown two types of bacterial symbionts to be coevolving with sharpshooters: the gamma-proteobacterium Baumannia cicadellinicola and the Bacteroidetes species Sulcia muelleri. We report here the sequencing and analysis of the 686,192–base pair genome of B. cicadellinicola and approximately 150 kilobase pairs of the small genome of S. muelleri, both isolated from H. coagulata. Our study, which to our knowledge is the first genomic analysis of an obligate symbiosis involving multiple partners, suggests striking complementarity in the biosynthetic capabilities of the two symbionts: B. cicadellinicola devotes a substantial portion of its genome to the biosynthesis of vitamins and cofactors required by animals and lacks most amino acid biosynthetic pathways, whereas S. muelleri apparently produces most or all of the essential amino acids needed by its host. This finding, along with other results of our genome analysis, suggests the existence of metabolic codependency among the two unrelated endosymbionts and their insect host. This dual symbiosis provides a model case for studying correlated genome evolution and genome reduction involving multiple organisms in an intimate, obligate mutualistic relationship. In addition, our analysis provides insight for the first time into the differences in symbionts between insects (e.g., aphids) that feed on phloem versus those like H. coagulata that feed on xylem. Finally, the genomes of these two symbionts provide potential targets for

  8. Metabolic complementarity and genomics of the dual bacterial symbiosis of sharpshooters.

    Directory of Open Access Journals (Sweden)

    Dongying Wu

    2006-06-01

    Full Text Available Mutualistic intracellular symbiosis between bacteria and insects is a widespread phenomenon that has contributed to the global success of insects. The symbionts, by provisioning nutrients lacking from diets, allow various insects to occupy or dominate ecological niches that might otherwise be unavailable. One such insect is the glassy-winged sharpshooter (Homalodisca coagulata), which feeds on xylem fluid, a diet exceptionally poor in organic nutrients. Phylogenetic studies based on rRNA have shown two types of bacterial symbionts to be coevolving with sharpshooters: the gamma-proteobacterium Baumannia cicadellinicola and the Bacteroidetes species Sulcia muelleri. We report here the sequencing and analysis of the 686,192-base pair genome of B. cicadellinicola and approximately 150 kilobase pairs of the small genome of S. muelleri, both isolated from H. coagulata. Our study, which to our knowledge is the first genomic analysis of an obligate symbiosis involving multiple partners, suggests striking complementarity in the biosynthetic capabilities of the two symbionts: B. cicadellinicola devotes a substantial portion of its genome to the biosynthesis of vitamins and cofactors required by animals and lacks most amino acid biosynthetic pathways, whereas S. muelleri apparently produces most or all of the essential amino acids needed by its host. This finding, along with other results of our genome analysis, suggests the existence of metabolic codependency among the two unrelated endosymbionts and their insect host. This dual symbiosis provides a model case for studying correlated genome evolution and genome reduction involving multiple organisms in an intimate, obligate mutualistic relationship. In addition, our analysis provides insight for the first time into the differences in symbionts between insects (e.g., aphids) that feed on phloem versus those like H. coagulata that feed on xylem. Finally, the genomes of these two symbionts provide potential

  9. Next-generation genome-scale models for metabolic engineering

    DEFF Research Database (Denmark)

    King, Zachary A.; Lloyd, Colton J.; Feist, Adam M.;

    2015-01-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict...... examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering....

  10. Meta-analysis of general bacterial subclades in whole-genome phylogenies using tree topology profiling.

    Science.gov (United States)

    Meinel, Thomas; Krause, Antje

    2012-01-01

    In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, disruptive biological effects and the steadily increasing number of genomes have led to a huge diversity in published phylogenies. Comparison of those and, moreover, identification of the impact of inference properties (underlying data model, inference technique) on particular reconstructions is almost impossible. In this work, we introduce tree topology profiling as a method to compare already published whole-genome phylogenies. This method requires visual determination of the particular topology in a drawn whole-genome phylogeny for a set of particular bacterial clans. For each clan, neighborhoods to other bacteria are collected into a catalogue of generalized alternative topologies. Particular topology alternatives found for an ordered list of bacterial clans reveal a topology profile that represents the analyzed phylogeny. To simulate the inhomogeneity of published gene content phylogenies we generate a set of seven phylogenies using different inference techniques and the SYSTERS-PhyloMatrix data model. After tree topology profiling on in total 54 selected published and newly inferred phylogenies, we separate artefactual from biologically meaningful phylogenies and associate particular inference results (phylogenies) with inference background (inference techniques as well as data models). Topological relationships of particular bacterial species groups are presented. With this work we introduce tree topology profiling into the scientific field of comparative phylogenomics.

  11. Extraction of ribosomal RNA and genomic DNA from soil for studying the diversity of the indigenous bacterial community

    NARCIS (Netherlands)

    Duarte, G.F.; Rosado, A.S.; Keijzer-Wolters, A.C.; Elsas, van J.D.

    1998-01-01

    A method for the indirect (cell extraction followed by nucleic acid extraction) isolation of bacterial ribosomal RNA (rRNA) and genomic DNA from soil was developed. The protocol allowed for the rapid parallel extraction of genomic DNA as well as small and large ribosomal subunit RNA from four soils

  12. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Full Text Available Background: Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.
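
    The link between a pairwise divergence (changes per site) and a substitution rate (changes per site per year) can be sketched with the standard relation rate = divergence / (2 × divergence time), where the factor of 2 accounts for changes accumulating along both lineages. The abstract does not state the divergence times the authors used, so the 90-million-year figure below is purely a hypothetical input, as are the function and variable names.

```python
def substitution_rate(divergence_per_site: float, divergence_time_years: float) -> float:
    """Per-lineage substitution rate from a pairwise divergence.

    Changes accumulate independently on both lineages since their split,
    so the total branch length separating them is 2 * divergence_time_years.
    """
    return divergence_per_site / (2.0 * divergence_time_years)


# 0.35 change/site lies inside the 0.32-0.37 range reported in the abstract.
# Assumption: a hypothetical split time of 90 million years (not from the abstract).
example_rate = substitution_rate(0.35, 90e6)

# Ratio of the coding-sequence rate to the midpoint of the overall rate,
# using the figures quoted in the abstract.
coding_to_overall = 1.1e-9 / 1.95e-9
```

    With these inputs the computed rate is close to 1.9 × 10^-9 change/site/year, the same order as the overall rate quoted in the abstract, and the coding-to-overall ratio of about 0.56 matches the abstract's "approximately half".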

  13. An empirical strategy for characterizing bacterial proteomes across species in the absence of genomic sequences.

    Directory of Open Access Journals (Sweden)

    Joshua E Turse

    Full Text Available Global protein identification through current proteomics methods typically depends on the availability of sequenced genomes. In spite of increasingly high throughput sequencing technologies, this information is not available for every microorganism and rarely available for entire microbial communities. Nevertheless, the protein-level homology that exists between related bacteria makes it possible to extract biological information from the proteome of an organism or microbial community by using the genomic sequences of a near neighbor organism. Here, we demonstrate a trans-organism search strategy for determining the extent to which near-neighbor genome sequences can be applied to identify proteins in unsequenced environmental isolates. In proof of concept testing, we found that within a CLUSTAL W distance of 0.089, near-neighbor genomes successfully identified a high percentage of proteins within an organism. Application of this strategy to characterize environmental bacterial isolates lacking sequenced genomes, but having 16S rDNA sequence similarity to Shewanella resulted in the identification of 300-500 proteins in each strain. The majority of identified pathways mapped to core processes, as well as to processes unique to the Shewanellae, in particular to the presence of c-type cytochromes. Examples of core functional categories include energy metabolism, protein and nucleotide synthesis and cofactor biosynthesis, allowing classification of bacteria by observation of conserved processes. Additionally, within these core functionalities, we observed proteins involved in the alternative lactate utilization pathway, recently described in Shewanella.

  14. In-Yeast Engineering of a Bacterial Genome Using CRISPR/Cas9.

    Science.gov (United States)

    Tsarmpopoulos, Iason; Gourgues, Géraldine; Blanchard, Alain; Vashee, Sanjay; Jores, Joerg; Lartigue, Carole; Sirand-Pugnet, Pascal

    2016-01-15

    One remarkable achievement in synthetic biology was the reconstruction of mycoplasma genomes and their cloning in yeast where they can be modified using available genetic tools. Recently, CRISPR/Cas9 editing tools were developed for yeast mutagenesis. Here, we report their adaptation for the engineering of bacterial genomes cloned in yeast. A seamless deletion of the mycoplasma glycerol-3-phosphate oxidase-encoding gene (glpO) was achieved without selection in one step, using 90 nt paired oligonucleotides as templates to drive recombination. Screening of the resulting clones revealed that more than 20% contained the desired deletion. After manipulation, the overall integrity of the cloned mycoplasma genome was verified by multiplex PCR and PFGE. Finally, the edited genome was back-transplanted into a mycoplasma recipient cell. In accordance with the deletion of glpO, the mutant mycoplasma was affected in the production of H2O2. This work paves the way to high-throughput manipulation of natural or synthetic genomes in yeast.

  15. Genomic Analysis of Caldithrix abyssi, the Thermophilic Anaerobic Bacterium of the Novel Bacterial Phylum Calditrichaeota

    Science.gov (United States)

    Kublanov, Ilya V.; Sigalova, Olga M.; Gavrilov, Sergey N.; Lebedinsky, Alexander V.; Rinke, Christian; Kovaleva, Olga; Chernyh, Nikolai A.; Ivanova, Natalia; Daum, Chris; Reddy, T.B.K.; Klenk, Hans-Peter; Spring, Stefan; Göker, Markus; Reva, Oleg N.; Miroshnichenko, Margarita L.; Kyrpides, Nikos C.; Woyke, Tanja; Gelfand, Mikhail S.; Bonch-Osmolovskaya, Elizaveta A.

    2017-01-01

    The genome of Caldithrix abyssi, the first cultivated representative of a phylum-level bacterial lineage, was sequenced within the framework of Genomic Encyclopedia of Bacteria and Archaea (GEBA) project. The genomic analysis revealed mechanisms allowing this anaerobic bacterium to ferment peptides or to implement nitrate reduction with acetate or molecular hydrogen as electron donors. The genome encoded five different [NiFe]- and [FeFe]-hydrogenases, one of which, group 1 [NiFe]-hydrogenase, is presumably involved in lithoheterotrophic growth, three others produce H2 during fermentation, and one is apparently bidirectional. The ability to reduce nitrate is determined by a nitrate reductase of the Nap family, while nitrite reduction to ammonia is presumably catalyzed by an octaheme cytochrome c nitrite reductase εHao. The genome contained genes of respiratory polysulfide/thiosulfate reductase; however, elemental sulfur and thiosulfate were not used as the electron acceptors for anaerobic respiration with acetate or H2, probably due to the lack of the gene of the maturation protein. Nevertheless, elemental sulfur and thiosulfate stimulated growth on fermentable substrates (peptides), being reduced to sulfide, most probably through the action of the cytoplasmic sulfide dehydrogenase and/or NAD(P)-dependent [NiFe]-hydrogenase (sulfhydrogenase) encoded by the genome. Surprisingly, the genome of this anaerobic microorganism encoded all genes for cytochrome c oxidase; however, its maturation machinery seems to be non-operational due to genomic rearrangements of supplementary genes. Despite the fact that sugars were not among the substrates reported when C. abyssi was first described, our genomic analysis revealed multiple genes of glycoside hydrolases, and some of them were predicted to be secreted. This finding aided in bringing out four carbohydrates that supported the growth of C. abyssi: starch, cellobiose, glucomannan and xyloglucan. The genomic analysis

  16. Quantitative analysis of correlation between AT and GC biases among bacterial genomes

    Science.gov (United States)

    Zhang, Ge

    2017-01-01

    Due to different replication mechanisms between the leading and lagging strands, nucleotide composition asymmetries widely exist in bacterial genomes. A general consideration reveals that the leading strand is enriched in Guanine (G) and Thymine (T), and the lagging strand is enriched in Adenine (A) and Cytosine (C). However, some bacteria, such as Bacillus subtilis, have been found to contain more A than T in the leading strand. To investigate the difference, we analyze the nucleotide asymmetry from the aspect of AT and GC bias correlations. In this study, we propose a windowless method, the Z-curve Correlation Coefficient (ZCC) index, based on the Z-curve method, and analyze more than 2000 bacterial genomes. We find that the majority of bacteria reveal negative correlations between AT and GC biases, while most genomes in Firmicutes and Tenericutes have positive ZCC indexes. The presence of PolC, purine asymmetry and a stronger gene preference for the leading strand are not confined to Firmicutes, but are also likely to occur in other phyla dominated by positive ZCC indexes. This method also provides a new insight into other relevant features like aerobism, and can be applied to analyze the correlation between RY (Purine and Pyrimidine) and MK (Amino and Keto) bias and so on. PMID:28158313
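
The idea of correlating AT and GC biases without a sliding window can be illustrated with a minimal Python sketch. It is not the ZCC index itself, whose exact definition is not given in this abstract. As a simplified stand-in, it builds cumulative A-minus-T and G-minus-C disparity curves along a sequence and takes their Pearson correlation. The function names `disparity_curves`, `pearson` and `bias_correlation` are hypothetical.

```python
from math import sqrt


def disparity_curves(seq):
    """Return cumulative (A - T) and (G - C) counts at each position.

    Assumption: these cumulative curves stand in for the AT and GC
    biases; the published ZCC index may be defined differently.
    """
    at_curve, gc_curve = [], []
    a_minus_t = 0
    g_minus_c = 0
    for base in seq.upper():
        if base == "A":
            a_minus_t += 1
        elif base == "T":
            a_minus_t -= 1
        elif base == "G":
            g_minus_c += 1
        elif base == "C":
            g_minus_c -= 1
        # Other symbols (e.g. N) leave both curves unchanged.
        at_curve.append(a_minus_t)
        gc_curve.append(g_minus_c)
    return at_curve, gc_curve


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant curve")
    return sxy / sqrt(sxx * syy)


def bias_correlation(seq):
    """Correlation between the cumulative AT and GC disparity curves."""
    at_curve, gc_curve = disparity_curves(seq)
    return pearson(at_curve, gc_curve)


if __name__ == "__main__":
    # Toy sequences only; real input would be a whole replicon.
    print(bias_correlation("AG" * 50))  # A and G accumulate together
    print(bias_correlation("AC" * 50))  # A accumulates while G - C falls
```

In this toy setting, a sequence that accumulates A and G together gives a positive value, and one that accumulates A and C together gives a negative value. That mirrors the sign convention described in the abstract only loosely.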

  17. Bacterial delivery of large intact genomic-DNA-containing BACs into mammalian cells.

    Science.gov (United States)

    Cheung, Wing; Kotzamanis, George; Abdulrazzak, Hassan; Goussard, Sylvie; Kaname, Tadashi; Kotsinas, Athanassios; Gorgoulis, Vassilis G; Grillot-Courvalin, Catherine; Huxley, Clare

    2012-01-01

    Efficient delivery of large intact vectors into mammalian cells remains problematical. Here we evaluate delivery by bacterial invasion of two large BACs of more than 150 kb in size into various cells. First, we determined the effect of several drugs on bacterial delivery of a small plasmid into different cell lines. Most drugs tested resulted in a marginal increase of the overall efficiency of delivery in only some cell lines, except the lysosomotropic drug chloroquine, which was found to increase the efficiency of delivery by 6-fold in B16F10 cells. Bacterial invasion was found to be significantly advantageous compared with lipofection in delivering large intact BACs into mouse cells, resulting in 100% of clones containing intact DNA. Furthermore, evaluation of expression of the human hypoxanthine phosphoribosyltransferase (HPRT) gene from its genomic locus, which was present in one of the BACs, showed that single copy integrations of the HPRT-containing BAC had occurred in mouse B16F10 cells and that expression of HPRT from each human copy was 0.33 times as much as from each endogenous mouse copy. These data provide new evidence that bacterial delivery is a convenient and efficient method to transfer large intact therapeutic genes into mammalian cells.

  18. Bacterial genes in the aphid genome: absence of functional gene transfer from Buchnera to its host.

    Directory of Open Access Journals (Sweden)

    Naruo Nikoh

    2010-02-01

    Full Text Available Genome reduction is typical of obligate symbionts. In cellular organelles, this reduction partly reflects transfer of ancestral bacterial genes to the host genome, but little is known about gene transfer in other obligate symbioses. Aphids harbor anciently acquired obligate mutualists, Buchnera aphidicola (Gammaproteobacteria), which have highly reduced genomes (420-650 kb), raising the possibility of gene transfer from ancestral Buchnera to the aphid genome. In addition, aphids often harbor other bacteria that also are potential sources of transferred genes. Previous limited sampling of genes expressed in bacteriocytes, the specialized cells that harbor Buchnera, revealed that aphids acquired at least two genes from bacteria. The newly sequenced genome of the pea aphid, Acyrthosiphon pisum, presents the first opportunity for a complete inventory of genes transferred from bacteria to the host genome in the context of an ancient obligate symbiosis. Computational screening of the entire A. pisum genome, followed by phylogenetic and experimental analyses, provided strong support for the transfer of 12 genes or gene fragments from bacteria to the aphid genome: three LD-carboxypeptidases (LdcA1, LdcA2, psiLdcA), five rare lipoprotein As (RlpA1-5), N-acetylmuramoyl-L-alanine amidase (AmiD), 1,4-beta-N-acetylmuramidase (bLys), DNA polymerase III alpha chain (psiDnaE), and ATP synthase delta chain (psiAtpH). Buchnera was the apparent source of two highly truncated pseudogenes (psiDnaE and psiAtpH). Most other transferred genes were closely related to genes from relatives of Wolbachia (Alphaproteobacteria). At least eight of the transferred genes (LdcA1, AmiD, RlpA1-5, bLys) appear to be functional, and expression of seven (LdcA1, AmiD, RlpA1-5) is highly upregulated in bacteriocytes. The LdcAs and RlpAs appear to have been duplicated after transfer. Our results excluded the hypothesis that genome reduction in Buchnera has been accompanied by gene transfer to the

  19. Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing.

    Directory of Open Access Journals (Sweden)

    Alexander William Eastman

    2015-01-01

    Full Text Available Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing

  20. Genome-wide identification of Streptococcus pneumoniae genes essential for bacterial replication during experimental meningitis

    DEFF Research Database (Denmark)

    Molzen, T E; Burghout, P; Bootsma, H J

    2010-01-01

    Meningitis is the most serious of invasive infections caused by the Gram-positive bacterium Streptococcus pneumoniae. Vaccines protect only against a limited number of serotypes, and evolving bacterial resistance to antimicrobials impedes treatment. Further insight into the molecular pathogenesis...... of invasive pneumococcal disease is required in order to enable the development of new or adjunctive treatments and/or pneumococcal vaccines that are efficient across serotypes. We applied genomic array footprinting (GAF) in the search for S. pneumoniae genes that are essential during experimental meningitis...

  1. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling-up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon was used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and GSM models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM model the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non

  2. Coordination of genomic structure and transcription by the main bacterial nucleoid-associated protein HU.

    Science.gov (United States)

    Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi

    2010-01-01

    The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci, that is, spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid.

  3. Genomics reveals historic and contemporary transmission dynamics of a bacterial disease among wildlife and livestock

    Science.gov (United States)

    Kamath, Pauline L.; Foster, Jeffrey T.; Drees, Kevin P.; Luikart, Gordon; Quance, Christine; Anderson, Neil J.; Clarke, P. Ryan; Cole, Eric K.; Drew, Mark L.; Edwards, William H.; Rhyan, Jack C.; Treanor, John J.; Wallen, Rick L.; White, Patrick J.; Robbe-Austerman, Suelee; Cross, Paul C.

    2016-01-01

    Whole-genome sequencing has provided fundamental insights into infectious disease epidemiology, but has rarely been used for examining transmission dynamics of a bacterial pathogen in wildlife. In the Greater Yellowstone Ecosystem (GYE), outbreaks of brucellosis have increased in cattle along with rising seroprevalence in elk. Here we use a genomic approach to examine Brucella abortus evolution, cross-species transmission and spatial spread in the GYE. We find that brucellosis was introduced into wildlife in this region at least five times. The diffusion rate varies among Brucella lineages (approximately 3 to 8 km per year) and over time. We also estimate 12 host transitions from bison to elk, and 5 from elk to bison. Our results support the notion that free-ranging elk are currently a self-sustaining brucellosis reservoir and the source of livestock infections, and that control measures in bison are unlikely to affect the dynamics of unrelated strains circulating in nearby elk populations.

  4. Computer models of bacterial cells: from generalized coarse-grained to genome-specific modular models

    Science.gov (United States)

    Nikolaev, Evgeni V.; Atlas, Jordan C.; Shuler, Michael L.

    2006-09-01

    We discuss a modular modelling framework to rapidly develop mathematical models of bacterial cells that would explicitly link genomic details to cell physiology and population response. An initial step in this approach is the development of a coarse-grained model, describing pseudo-chemical interactions between lumped species. A hybrid model of interest can then be constructed by embedding genome-specific detail for a particular cellular subsystem (e.g. central metabolism), called here a module, into the coarse-grained model. Specifically, a new strategy for sensitivity analysis of the cell division limit cycle is introduced to identify which pseudo-molecular processes should be delumped to implement a particular biological function in a growing cell (e.g. ethanol overproduction or pathogen viability). To illustrate the modeling principles and highlight computational challenges, the Cornell coarse-grained model of Escherichia coli B/r-A is used to benchmark the proposed framework.

  5. Evolution of Intra-specific Regulatory Networks in a Multipartite Bacterial Genome.

    Directory of Open Access Journals (Sweden)

    Marco Galardini

    2015-09-01

    Full Text Available Reconstruction of the regulatory network is an important step in understanding how organisms control the expression of gene products and therefore phenotypes. Recent studies have pointed out the importance of regulatory network plasticity in bacterial adaptation and evolution. The evolution of such networks within and outside the species boundary is however still obscure. Sinorhizobium meliloti is an ideal species for such study, having three large replicons, many genomes available and a significant knowledge of its transcription factors (TF). Each replicon has a specific functional and evolutionary mark; which might also emerge from the analysis of their regulatory signatures. Here we have studied the plasticity of the regulatory network within and outside the S. meliloti species, looking for the presence of 41 TFs binding motifs in 51 strains and 5 related rhizobial species. We have detected a preference of several TFs for one of the three replicons, and the function of regulated genes was found to be in accordance with the overall replicon functional signature: house-keeping functions for the chromosome, metabolism for the chromid, symbiosis for the megaplasmid. This therefore suggests a replicon-specific wiring of the regulatory network in the S. meliloti species. At the same time a significant part of the predicted regulatory network is shared between the chromosome and the chromid, thus adding an additional layer by which the chromid integrates itself in the core genome. Furthermore, the regulatory network distance was found to be correlated with both promoter regions and accessory genome evolution inside the species, indicating that both pangenome compartments are involved in the regulatory network evolution. We also observed that genes which are not included in the species regulatory network are more likely to belong to the accessory genome, indicating that regulatory interactions should also be considered to predict gene conservation in

  6. Evolution of Intra-specific Regulatory Networks in a Multipartite Bacterial Genome.

    Science.gov (United States)

    Galardini, Marco; Brilli, Matteo; Spini, Giulia; Rossi, Matteo; Roncaglia, Bianca; Bani, Alessia; Chiancianesi, Manuela; Moretto, Marco; Engelen, Kristof; Bacci, Giovanni; Pini, Francesco; Biondi, Emanuele G; Bazzicalupo, Marco; Mengoni, Alessio

    2015-09-01

    Reconstruction of the regulatory network is an important step in understanding how organisms control the expression of gene products and therefore phenotypes. Recent studies have pointed out the importance of regulatory network plasticity in bacterial adaptation and evolution. The evolution of such networks within and outside the species boundary is however still obscure. Sinorhizobium meliloti is an ideal species for such study, having three large replicons, many genomes available and a significant knowledge of its transcription factors (TF). Each replicon has a specific functional and evolutionary mark; which might also emerge from the analysis of their regulatory signatures. Here we have studied the plasticity of the regulatory network within and outside the S. meliloti species, looking for the presence of 41 TFs binding motifs in 51 strains and 5 related rhizobial species. We have detected a preference of several TFs for one of the three replicons, and the function of regulated genes was found to be in accordance with the overall replicon functional signature: house-keeping functions for the chromosome, metabolism for the chromid, symbiosis for the megaplasmid. This therefore suggests a replicon-specific wiring of the regulatory network in the S. meliloti species. At the same time a significant part of the predicted regulatory network is shared between the chromosome and the chromid, thus adding an additional layer by which the chromid integrates itself in the core genome. Furthermore, the regulatory network distance was found to be correlated with both promoter regions and accessory genome evolution inside the species, indicating that both pangenome compartments are involved in the regulatory network evolution. We also observed that genes which are not included in the species regulatory network are more likely to belong to the accessory genome, indicating that regulatory interactions should also be considered to predict gene conservation in bacterial

  7. Single-molecule approach to bacterial genomic comparisons via optical mapping.

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Shiguo [Univ. Wisc.-Madison; Kile, A. [Univ. Wisc.-Madison; Bechner, M. [Univ. Wisc.-Madison; Kvikstad, E. [Univ. Wisc.-Madison; Deng, W. [Univ. Wisc.-Madison; Wei, J. [Univ. Wisc.-Madison; Severin, J. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Dimalanta, E. [Univ. Wisc.-Madison; Lamers, C. [Univ. Wisc.-Madison; Burland, V. [Univ. Wisc.-Madison; Blattner, F. R. [Univ. Wisc.-Madison; Schwartz, David C. [Univ. Wisc.-Madison

    2004-01-01

    Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole genome approaches must be developed to fully leverage this information at the level of strain diversity that maximize discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.

  8. Bayesian prediction of bacterial growth temperature range based on genome sequences

    Directory of Open Access Journals (Sweden)

    Jensen Dan B

    2012-12-01

    Full Text Available Abstract Background: The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. Results: This study found a total of 40 protein families useful for distinction between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. Conclusions: This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.
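
The Matthews correlation coefficient quoted above can be computed for a three-class problem with the standard multi-class generalization. The Python sketch below is only an illustration of that metric, not the program described in the abstract. The function name `multiclass_mcc` and the label strings are hypothetical, and the abstract does not state which multi-class variant the authors used.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(true_labels, predicted_labels):
    """Multi-class Matthews correlation coefficient.

    Uses the generalized form
        (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    and t_k / p_k the number of times class k is the true / predicted label.
    Returns 0.0 when the denominator is zero (e.g. a single predicted class).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    s = len(true_labels)
    c = sum(t == p for t, p in zip(true_labels, predicted_labels))
    t_counts = Counter(true_labels)
    p_counts = Counter(predicted_labels)
    classes = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    true_term = s * s - sum(t_counts[k] ** 2 for k in classes)
    pred_term = s * s - sum(p_counts[k] ** 2 for k in classes)
    denominator = sqrt(true_term * pred_term)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration; not data from the study.
    truth = ["thermo", "meso", "meso", "psychro", "meso", "thermo"]
    guess = ["thermo", "meso", "thermo", "psychro", "meso", "meso"]
    print(round(multiclass_mcc(truth, guess), 3))
```

A value of 1 means perfect agreement and 0 means no better than chance. Unlike plain accuracy, the score is not inflated when one class (for example mesophiles) dominates the data set.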

  9. Large-Scale Sequencing: The Future of Genomic Sciences Colloquium

    Energy Technology Data Exchange (ETDEWEB)

    Margaret Riley; Merry Buckley

    2009-01-01

    Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. A coordinated sequencing effort of cultured organisms is an appropriate place to begin

  10. Genome-scale constraint-based modeling of Geobacter metallireducens

    Directory of Open Access Journals (Sweden)

    Famili Iman

    2009-01-01

    Full Text Available Abstract Background: Geobacter metallireducens was the first organism that can be grown in pure culture to completely oxidize organic compounds with Fe(III) oxide serving as electron acceptor. Geobacter species, including G. sulfurreducens and G. metallireducens, are used for bioremediation and electricity generation from waste organic matter and renewable biomass. The constraint-based modeling approach enables the development of genome-scale in silico models that can predict the behavior of complex biological systems and their responses to the environments. Such a modeling approach was applied to provide physiological and ecological insights on the metabolism of G. metallireducens. Results: The genome-scale metabolic model of G. metallireducens was constructed to include 747 genes and 697 reactions. Compared to the G. sulfurreducens model, the G. metallireducens metabolic model contains 118 unique reactions that reflect many of G. metallireducens' specific metabolic capabilities. Detailed examination of the G. metallireducens model suggests that its central metabolism contains several energy-inefficient reactions that are not present in the G. sulfurreducens model. Experimental biomass yield of G. metallireducens growing on pyruvate was lower than the predicted optimal biomass yield. Microarray data of G. metallireducens growing with benzoate and acetate indicated that genes encoding these energy-inefficient reactions were up-regulated by benzoate. These results suggested that the energy-inefficient reactions were likely turned off during G. metallireducens growth with acetate for optimal biomass yield, but were up-regulated during growth with complex electron donors such as benzoate for rapid energy generation. Furthermore, several computational modeling approaches were applied to accelerate G. metallireducens research. For example, growth of G. metallireducens with different electron donors and electron acceptors was studied using the genome-scale

  11. A genome-scale metabolic reconstruction of Mycoplasma genitalium, iPS189.

    Directory of Open Access Journals (Sweden)

    Patrick F Suthers

    2009-02-01

    Full Text Available With a genome size of approximately 580 kb and approximately 480 protein coding regions, Mycoplasma genitalium is one of the smallest known self-replicating organisms and, additionally, has extremely fastidious nutrient requirements. The reduced genomic content of M. genitalium has led researchers to suggest that the molecular assembly contained in this organism may be a close approximation to the minimal set of genes required for bacterial growth. Here, we introduce a systematic approach for the construction and curation of a genome-scale in silico metabolic model for M. genitalium. Key challenges included estimation of biomass composition, handling of enzymes with broad specificities, and the lack of a defined medium. Computational tools were subsequently employed to identify and resolve connectivity gaps in the model as well as growth prediction inconsistencies with gene essentiality experimental data. The curated model, M. genitalium iPS189 (262 reactions, 274 metabolites), is 87% accurate in recapitulating in vivo gene essentiality results for M. genitalium. Approaches and tools described herein provide a roadmap for the automated construction of in silico metabolic models of other organisms.

  12. BPhyOG: An interactive server for genome-wide inference of bacterial phylogenies based on overlapping genes

    Directory of Open Access Journals (Sweden)

    Lin Kui

    2007-07-01

    Full Text Available Background: Overlapping genes (OGs) in bacterial genomes are pairs of adjacent genes of which the coding sequences overlap partly or entirely. With the rapid accumulation of sequence data, many OGs in bacterial genomes have now been identified. Indeed, these might prove a consistent feature across all microbial genomes. Our previous work suggests that OGs can be considered as robust markers at the whole genome level for the construction of phylogenies. An online, interactive web server for inferring phylogenies is needed for biologists to analyze phylogenetic relationships among a set of bacterial genomes of interest. Description: BPhyOG is an online interactive server for reconstructing the phylogenies of completely sequenced bacterial genomes on the basis of their shared overlapping genes. It provides two tree-reconstruction methods: Neighbor Joining (NJ) and Unweighted Pair-Group Method using Arithmetic averages (UPGMA). Users can apply the desired method to generate phylogenetic trees, which are based on an evolutionary distance matrix for the selected genomes. The distance between two genomes is defined by the normalized number of their shared OG pairs. BPhyOG also allows users to browse the OGs that were used to infer the phylogenetic relationships. It provides detailed annotation for each OG pair and the features of the component genes through hyperlinks. Users can also retrieve each of the homologous OG pairs that have been determined among 177 genomes. It is a useful tool for analyzing the tree of life and overlapping genes from a genomic standpoint. Conclusion: BPhyOG is a useful interactive web server for genome-wide inference of any potential evolutionary relationship among the genomes selected by users. It currently includes 177 completely sequenced bacterial genomes containing 79,855 OG pairs, the annotation and homologous OG pairs of which are integrated comprehensively. The reliability of phylogenies complemented by

  13. Differential regulation of horizontally acquired and core genome genes by the bacterial modulator H-NS.

    Directory of Open Access Journals (Sweden)

    Rosa C Baños

    2009-06-01

    Full Text Available Horizontal acquisition of DNA by bacteria dramatically increases genetic diversity and hence successful bacterial colonization of several niches, including the human host. A relevant issue is how this newly acquired DNA interacts and integrates in the regulatory networks of the bacterial cell. The global modulator H-NS targets both core genome and HGT genes and silences gene expression in response to external stimuli such as osmolarity and temperature. Here we provide evidence that H-NS discriminates and differentially modulates core and HGT DNA. As an example of this, plasmid R27-encoded H-NS protein has evolved to selectively silence HGT genes and does not interfere with core genome regulation. In turn, differential regulation of both gene lineages by resident chromosomal H-NS requires a helper protein: the Hha protein. Tight silencing of HGT DNA is accomplished by H-NS-Hha complexes. In contrast, core genes are modulated by H-NS homoligomers. Remarkably, the presence of Hha-like proteins is restricted to the Enterobacteriaceae. In addition, conjugative plasmids encoding H-NS variants have hitherto been isolated only from members of the family. Thus, the H-NS system in enteric bacteria presents unique evolutionary features. The capacity to selectively discriminate between core and HGT DNA may help to maintain horizontally transmitted DNA in silent form and may give these bacteria a competitive advantage in adapting to new environments, including host colonization.

  14. Gain and loss of phototrophic genes revealed by comparison of two Citromicrobium bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Qiang Zheng

    Full Text Available Proteobacteria are thought to have diverged from a phototrophic ancestor, according to the scattered distribution of phototrophy throughout the proteobacterial clade, and so the occurrence of numerous closely related phototrophic and chemotrophic microorganisms may be the result of the loss of genes for phototrophy. A widespread form of bacterial phototrophy is based on the photochemical reaction center, encoded by puf and puh operons that typically are in a 'photosynthesis gene cluster' (abbreviated as the PGC) with pigment biosynthesis genes. Comparison of two closely related Citromicrobial genomes (98.1% sequence identity of complete 16S rRNA genes), Citromicrobium sp. JL354, which contains two copies of reaction center genes, and Citromicrobium strain JLT1363, which is chemotrophic, revealed evidence for the loss of phototrophic genes. However, evidence of horizontal gene transfer was found in these two bacterial genomes. An incomplete PGC (pufLMC-puhCBA) in strain JL354 was located within an integrating conjugative element, which indicates a potential mechanism for the horizontal transfer of genes for phototrophy.

  15. A genomic scale map of genetic diversity in Trypanosoma cruzi

    Directory of Open Access Journals (Sweden)

    Ackermann Alejandro A

    2012-12-01

    Full Text Available Background: Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results: Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions: This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~60% of the genetic diversity present in the

  16. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    Full Text Available BACKGROUND: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. METHODOLOGY: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. CONCLUSIONS: We compare BFAST to a selection of large-scale alignment tools (BLAT, MAQ, SHRiMP, and SOAP) in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
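
    The gapped Smith-Waterman step mentioned in the abstract can be illustrated with a textbook implementation. This is only a minimal sketch of the general technique, not BFAST's code: the function name `smith_waterman`, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Textbook local alignment with a linear gap penalty (illustrative only).

    Returns (best_score, aligned_read, aligned_ref); '-' marks a gap.
    Scoring values are assumptions, not BFAST defaults.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for aligning read[i-1] with ref[j-1].
        return match if read[i - 1] == ref[j - 1] else mismatch

    # Fill the dynamic-programming matrix; scores are floored at zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),  # match or mismatch
                score[i - 1][j] + gap,            # read base against a gap
                score[i][j - 1] + gap,            # reference base against a gap
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    aligned_read, aligned_ref = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            aligned_read.append(read[i - 1])
            aligned_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_read.append(read[i - 1])
            aligned_ref.append("-")
            i -= 1
        else:
            aligned_read.append("-")
            aligned_ref.append(ref[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_read)), "".join(reversed(aligned_ref))
```

    Calling `smith_waterman("ACGTTGCA", "ACGTGCA")` aligns a read carrying one inserted base against the reference, so the returned reference string contains a single `-`. In a real aligner this quadratic-time step would only be run on the candidate locations proposed by the index lookup.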

  17. Modeling Lactococcus lactis using a genome-scale flux model

    Directory of Open Access Journals (Sweden)

    Nielsen Jens

    2005-06-01

    Full Text Available Background: Genome-scale flux models are useful tools to represent and analyze microbial metabolism. In this work we reconstructed the metabolic network of the lactic acid bacterium Lactococcus lactis and developed a genome-scale flux model able to simulate and analyze network capabilities and whole-cell function under aerobic and anaerobic continuous cultures. Flux balance analysis (FBA) and minimization of metabolic adjustment (MOMA) were used as modeling frameworks. Results: The metabolic network was reconstructed using the annotated genome sequence from L. lactis ssp. lactis IL1403 together with physiological and biochemical information. The established network comprised a total of 621 reactions and 509 metabolites, representing the overall metabolism of L. lactis. Experimental data reported in the literature was used to fit the model to phenotypic observations. Regulatory constraints had to be included to simulate certain metabolic features, such as the shift from homo to heterolactic fermentation. A minimal medium for in silico growth was identified, indicating the requirement of four amino acids in addition to a sugar. Remarkably, de novo biosynthesis of four other amino acids was observed even when all amino acids were supplied, which is in good agreement with experimental observations. Additionally, enhanced metabolic engineering strategies for improved diacetyl producing strains were designed. Conclusion: The L. lactis metabolic network can now be used for a better understanding of lactococcal metabolic capabilities and potential, for the design of enhanced metabolic engineering strategies and for integration with other types of 'omic' data, to assist in finding new information on cellular organization and function.
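
    For readers unfamiliar with the two frameworks named in the abstract, their standard textbook formulations are sketched below. The symbols are conventional notation and are assumptions here, not taken from the record: S is the stoichiometric matrix, v the flux vector, c the objective weights (typically selecting the biomass reaction), and v^{wt} the wild-type flux distribution.

```latex
% Flux balance analysis (FBA): a linear program at steady state
\max_{v} \; c^{\mathsf{T}} v
\quad \text{subject to} \quad
S v = 0, \qquad v^{\min} \le v \le v^{\max}

% Minimization of metabolic adjustment (MOMA) for a knockout of reaction set K:
% a quadratic program that stays as close as possible to the wild-type fluxes
\min_{v} \; \lVert v - v^{wt} \rVert_2^{2}
\quad \text{subject to} \quad
S v = 0, \qquad v^{\min} \le v \le v^{\max}, \qquad v_k = 0 \;\; \forall k \in K
```

    For the network size reported above, S would have 509 rows (metabolites) and 621 columns (reactions).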

  18. Genome Analysis of a Zygomycete Fungus Choanephora cucurbitarum Elucidates Necrotrophic Features Including Bacterial Genes Related to Plant Colonization

    Science.gov (United States)

    Min, Byoungnam; Park, Ji-Hyun; Park, Hongjae; Shin, Hyeon-Dong; Choi, In-Geol

    2017-01-01

    A zygomycete fungus, Choanephora cucurbitarum is a plant pathogen that causes blossom rot in cucurbits and other plants. Here we report the genome sequence of Choanephora cucurbitarum KUS-F28377 isolated from squash. The assembled genome has a size of 29.1 Mbp and 11,977 protein-coding genes. The genome analysis indicated that C. cucurbitarum may employ a plant pathogenic mechanism similar to that of bacterial plant pathogens. The genome contained 11 genes with a Streptomyces subtilisin inhibitor-like domain, which plays an important role in the defense against plant immunity. This domain has been found only in bacterial genomes. Carbohydrate active enzyme analysis detected 312 CAZymes in this genome where carbohydrate esterase family 6, rarely found in dikaryotic fungal genomes, was comparatively enriched. The comparative genome analysis showed that the genes related to sexual communication such as the biosynthesis of β-carotene and trisporic acid were conserved and diverged during the evolution of zygomycete genomes. Overall, these findings will help us to understand how zygomycetes are associated with plants. PMID:28091548

  19. First genomic insights into members of a candidate bacterial phylum responsible for wastewater bulking

    Directory of Open Access Journals (Sweden)

    Yuji Sekiguchi

    2015-01-01

    Full Text Available Filamentous cells belonging to the candidate bacterial phylum KSB3 were previously identified as the causative agent of fatal filament overgrowth (bulking) in a high-rate industrial anaerobic wastewater treatment bioreactor. Here, we obtained near complete genomes from two KSB3 populations in the bioreactor, including the dominant bulking filament, using differential coverage binning of metagenomic data. Fluorescence in situ hybridization with 16S rRNA-targeted probes specific for the two populations confirmed that both are filamentous organisms. Genome-based metabolic reconstruction and microscopic observation of the KSB3 filaments in the presence of sugar gradients indicate that both filament types are Gram-negative, strictly anaerobic fermenters capable of non-flagellar based gliding motility, and have a strikingly large number of sensory and response regulator genes. We propose that the KSB3 filaments are highly sensitive to their surroundings and that cellular processes, including those causing bulking, are controlled by external stimuli. The obtained genomes lay the foundation for a more detailed understanding of environmental cues used by KSB3 filaments, which may lead to more robust treatment options to prevent bulking.

  20. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-06-18

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  1. Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie; Schwientek, Patrick; Tremblay, Julien; Schackwitz, Wendy; Martin, Joel; Pati, Amrita; Bushnell, Brian; Foster, Brian; Kang, Dongwan; Tringe, Susannah G.; Bertilsson, Stefan; Moran, Mary Ann; Shade, Ashley; Newton, Ryan J.; Stevens, Sarah; McMahon, Katherine D.; Malmstrom, Rex R.

    2014-05-12

    Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

  2. Birth of a W sex chromosome by horizontal transfer of Wolbachia bacterial symbiont genome

    Science.gov (United States)

    Leclercq, Sébastien; Thézé, Julien; Chebbi, Mohamed Amine; Giraud, Isabelle; Moumen, Bouziane; Ernenwein, Lise; Grève, Pierre; Cordaux, Richard

    2016-01-01

    Sex determination is a fundamental developmental pathway governing male and female differentiation, with profound implications for morphology, reproductive strategies, and behavior. In animals, sex differences between males and females are generally determined by genetic factors carried by sex chromosomes. Sex chromosomes are remarkably variable in origin and can differ even between closely related species, indicating that transitions occur frequently and independently in different groups of organisms. The evolutionary causes underlying sex chromosome turnover are poorly understood, however. Here we provide evidence indicating that Wolbachia bacterial endosymbionts triggered the evolution of new sex chromosomes in the common pillbug Armadillidium vulgare. We identified a 3-Mb insert of a feminizing Wolbachia genome that was recently transferred into the pillbug nuclear genome. The Wolbachia insert shows perfect linkage to the female sex, occurs in a male genetic background (i.e., lacking the ancestral W female sex chromosome), and is hemizygous. Our results support the conclusion that the Wolbachia insert is now acting as a female sex-determining region in pillbugs, and that the chromosome carrying the insert is a new W sex chromosome. Thus, bacteria-to-animal horizontal genome transfer represents a remarkable mechanism underpinning the birth of sex chromosomes. We conclude that sex ratio distorters, such as Wolbachia endosymbionts, can be powerful agents of evolutionary transitions in sex determination systems in animals. PMID:27930295

  3. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    Energy Technology Data Exchange (ETDEWEB)

    Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

    2011-02-15

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  4. Next-generation genome-scale models for metabolic engineering.

    Science.gov (United States)

    King, Zachary A; Lloyd, Colton J; Feist, Adam M; Palsson, Bernhard O

    2015-12-01

    Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods, encompassing many biological processes and simulation strategies, is now being developed, and next-generation models enable new types of predictions. Here, three key examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering.

  5. Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a web-based resource

    Directory of Open Access Journals (Sweden)

    Vergnaud Gilles

    2004-01-01

    Full Text Available Background: Polymorphic tandem repeat typing is a new generic technology which has been proved to be very efficient for bacterial pathogens such as B. anthracis, M. tuberculosis, P. aeruginosa, L. pneumophila, Y. pestis. The previously developed tandem repeats database takes advantage of the release of genome sequence data for a growing number of bacteria to facilitate the identification of tandem repeats. The development of an assay then requires the evaluation of tandem repeat polymorphism on well-selected sets of isolates. In the case of major human pathogens, such as S. aureus, more than one strain is being sequenced, so that tandem repeats most likely to be polymorphic can now be selected in silico based on genome sequence comparison. Results: In addition to the previously described general Tandem Repeats Database, we have developed a tool to automatically identify tandem repeats of a different length in the genome sequence of two (or more) closely related bacterial strains. Genome comparisons are pre-computed. The results of the comparisons are parsed in a database, which can be conveniently queried over the internet according to criteria of practical value, including repeat unit length, predicted size difference, etc. Comparisons are available for 16 bacterial species, and the orthopox viruses, including the variola virus and three of its close neighbors. Conclusions: We are presenting an internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools accessible at http://minisatellites.u-psud.fr now comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial

  6. Distinct soil bacterial communities along a small-scale elevational gradient in alpine tundra

    Directory of Open Access Journals (Sweden)

    Congcong Shen

    2015-06-01

    Full Text Available The elevational diversity pattern for microorganisms has received great attention recently but is still understudied, and phylogenetic relatedness is rarely studied for microbial elevational distributions. Using a bar-coded pyrosequencing technique, we examined the biodiversity patterns for soil bacterial communities of the tundra ecosystem at elevations of 2000–2500 m on Changbai Mountain in China. Bacterial taxonomic richness displayed a linear decreasing trend with increasing elevation. Phylogenetic diversity and mean nearest taxon distance (MNTD) exhibited a unimodal pattern with elevation. Bacterial communities were more phylogenetically clustered than expected by chance at all elevations based on the standardized effect size of the MNTD metric. The bacterial communities differed dramatically among elevations, and the community composition was significantly correlated with soil total carbon, total nitrogen, C:N ratio, and dissolved organic carbon. Multiple ordinary least squares regression analysis showed that the observed biodiversity patterns strongly correlated with soil total carbon and C:N ratio. Taken together, this is the first time that a significant bacterial diversity pattern has been observed across a small-scale elevational gradient. Our results indicated that soil carbon and nitrogen contents were the critical environmental factors affecting bacterial elevational distribution in Changbai Mountain tundra. This suggested that ecological niche-based environmental filtering processes related to soil carbon and nitrogen contents could play a dominant role in structuring bacterial communities along the elevational gradient.

  7. Dynamic bacterial communities on reverse-osmosis membranes in a full-scale desalination plant.

    Science.gov (United States)

    Manes, C-L de O; West, N; Rapenne, S; Lebaron, P

    2011-01-01

    To better understand biofouling of seawater reverse osmosis (SWRO) membranes, bacterial diversity was characterized in the intake water, in subsequently pretreated water and on SWRO membranes from a full-scale desalination plant (FSDP) during a 9-month period. 16S rRNA gene fingerprinting and sequencing revealed that bacterial communities in the water samples and on the SWRO membranes were very different. The bacterial diversity of the active and the total bacterial fractions of the water samples remained relatively stable over the sampling period, whereas the bacterial community structure on the four SWRO membrane samples was significantly different. The richness and evenness of the SWRO membrane bacterial communities increased with usage time, with the Shannon diversity index increasing from 2.2 to 3.7. In the oldest SWRO membrane (330 days), no single operational taxonomic unit (OTU) dominated and the majority of the OTUs fell into the Alphaproteobacteria or the Planctomycetes. In striking contrast, a Betaproteobacteria OTU affiliated to the genus Ideonella was dominant and exclusively found in the membrane used for the shortest time (10 days). This suggests that bacteria belonging to this genus could be one of the primary colonizers of the SWRO membrane. Knowledge of the dominant bacterial species on SWRO membranes and their dynamics should help guide culture studies for physiological characterization of biofilm forming species.
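
    As a reminder of what the Shannon diversity index reported above measures, the following is a minimal sketch of the standard formula H = -sum(p_i * ln p_i) over OTU relative abundances. The function names and the use of the natural logarithm are assumptions for illustration; the record does not state which log base or software the authors used.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with non-zero counts.

    otu_counts: iterable of non-negative read counts, one per OTU.
    The natural logarithm is an assumption; other bases rescale H.
    """
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one OTU must have a non-zero count")
    return -sum((c / total) * math.log(c / total) for c in counts)


def pielou_evenness(otu_counts):
    """Evenness J = H / ln(S), where S is the number of observed OTUs."""
    richness = sum(1 for c in otu_counts if c > 0)
    if richness < 2:
        return 0.0  # evenness is not informative for a single OTU
    return shannon_index(otu_counts) / math.log(richness)
```

    A community dominated by one OTU, such as `[97, 1, 1, 1]`, gives a much lower index than an even community such as `[25, 25, 25, 25]`, which matches the qualitative contrast the abstract draws between the 10-day and 330-day membranes.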

  8. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    Science.gov (United States)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.

  9. Distinguishing bacterial pathogens of potato using a genome-wide microarray approach.

    Science.gov (United States)

    Aittamaa, M; Somervuo, P; Pirhonen, M; Mattinen, L; Nissinen, R; Auvinen, P; Valkonen, J P T

    2008-09-01

    A set of 9676 probes was designed for the most harmful bacterial pathogens of potato and tested in a microarray format. Gene-specific probes could be designed for all genes of Pectobacterium atrosepticum, c. 50% of the genes of Streptomyces scabies and c. 30% of the genes of Clavibacter michiganensis ssp. sepedonicus utilizing the whole-genome sequence information available. For Streptomyces turgidiscabies, 226 probes were designed according to the sequences of a pathogenicity island containing important virulence genes. In addition, probes were designed for the virulence-associated nip (necrosis-inducing protein) genes of P. atrosepticum, P. carotovorum and Dickeya dadantii and for the intergenic spacer (IGS) sequences of the 16S-23S rRNA gene region. Ralstonia solanacearum was not included in the study, because it is a quarantine organism and is not presently found in Finland, but a few probes were also designed for this species. The probes contained on average 40 target-specific nucleotides and were synthesized on the array in situ, organized as eight sub-arrays with an identical set of probes which could be used for hybridization with different samples. All bacteria were readily distinguished using a single channel system for signal detection. Nearly all of the c. 1000 probes designed for C. michiganensis ssp. sepedonicus, c. 50% and 40% of the c. 4000 probes designed for the genes of S. scabies and P. atrosepticum, respectively, and over 100 probes for S. turgidiscabies showed significant signals only with the respective species. P. atrosepticum, P. carotovorum and Dickeya strains were all detected with 110 common probes. By contrast, the strains of these species were found to differ in their signal profiles. Probes targeting the IGS region and nip genes could be used to place strains of Dickeya to two groups, which correlated with differences in virulence. 
    Taken together, the approach of using a custom-designed, genome-wide microarray provided a robust means

  10. MobilomeFINDER: Web-Based Tools for In Silico and Experimental Discovery of Bacterial Genomic Islands

    OpenAIRE

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Hinton, Jay C. D.; Barer, Michael R.; Deng, Zixin; Rajakumar, Kumar; Lory, Stephen

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’....

  11. Expression of lysozymes from Erwinia amylovora phages and Erwinia genomes and inhibition by a bacterial protein.

    Science.gov (United States)

    Müller, Ina; Gernold, Marina; Schneider, Bernd; Geider, Klaus

    2012-01-01

    Genes coding for lysozyme-inhibiting proteins (Ivy) were cloned from the chromosomes of the plant pathogens Erwinia amylovora and Erwinia pyrifoliae. The product interfered not only with activity of hen egg white lysozyme, but also with an enzyme from E. amylovora phage ΦEa1h. We have expressed lysozyme genes from the genomes of three Erwinia species in Escherichia coli. The lysozymes expressed from genes of the E. amylovora phages ΦEa104 and ΦEa116, Erwinia chromosomes and Arabidopsis thaliana were not affected by Ivy. The enzyme from bacteriophage ΦEa1h was fused at the N- or C-terminus to other peptides. Compared to the intact lysozyme, a His-tag reduced its lytic activity about 10-fold and larger fusion proteins abolished activity completely. Specific protease cleavage restored lysozyme activity of a GST-fusion. The bacteriophage-encoded lysozymes were more active than the enzymes from bacterial chromosomes. Viral lyz genes were inserted into a broad-host range vector, and transfer to E. amylovora inhibited cell growth. Inserted in the yeast Pichia pastoris, the ΦEa1h-lysozyme was secreted and also inhibited by Ivy. Here we describe expression of unrelated cloned 'silent' lyz genes from Erwinia chromosomes and a novel interference of bacterial Ivy proteins with a viral lysozyme.

  12. Transgenic Rice Plants Harboring Genomic DNA from Zizania latifolia Confer Bacterial Blight Resistance

    Institute of Scientific and Technical Information of China (English)

    SHEN Wei-wei; SONG Cheng-li; CHEN Jie; Fu Ya-ping; Wu Jian-li; JIANG Shao-mei

    2011-01-01

    Based on the sequence of a resistance gene analog FZ14 derived from Zizania latifolia (Griseb.), a pair of specific PCR primers FZ14P1/FZ14P2 was designed to isolate a candidate disease resistance gene. The pooled-PCR approach was adopted using the primer pair to screen a genomic transformation-competent artificial chromosome (TAC) library derived from Z. latifolia. A positive TAC clone (ZR1) was obtained and confirmed by sequence analysis. The results indicated that ZR1 consisted of conserved motifs similar to P-loop (kinase 1a), kinase 2, kinase 3a and GLPL (Gly-Leu-Pro-Leu), suggesting that it could be a portion of an NBS-LRR type of resistance gene. Using Agrobacterium-mediated transformation of Nipponbare mature embryo, a total of 48 independent transgenic T0 plants were obtained. Among them, 36 plants were highly resistant to the virulent bacterial blight strain PXO71. The results indicate that ZR1 contains at least one functional bacterial blight resistance gene.

  13. Selection for Unequal Densities of Sigma70 Promoter-like Signals in Different Regions of Large Bacterial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Huerta, Araceli M.; Francino, M. Pilar; Morett, Enrique; Collado-Vides, Julio

    2006-03-01

    The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, that are recognizable by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently-transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. In order to investigate the generality of this pattern, we have used position weight matrices describing the -35 and -10 promoter boxes of E. coli to search for these motifs in 43 additional genomes belonging to most established bacterial phyla, after specific calibration of the matrices according to the base composition of the noncoding regions of each genome. We have found that all bacterial species analyzed contain similar promoter-like motifs, and that, in most cases, these motifs follow the same genomic distribution observed in E. coli. Differential densities between regulatory and nonregulatory regions are detectable in most bacterial genomes, with the exception of those that have experienced evolutionary extreme genome reduction. Thus, the phylogenetic distribution of this pattern mirrors that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is the outcome of a process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential
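
    The motif search described above can be sketched as a position weight matrix scan. This is a minimal illustration, not the authors' pipeline: the matrix values, the uniform background, the threshold and the test sequence are all hypothetical, and the log2-odds scoring is an assumed (common) convention. In the study the background would be calibrated to the base composition of each genome's noncoding regions.

```python
import math

def log_odds_matrix(pfm, background):
    """Convert per-position base probabilities into log2-odds scores."""
    return [{b: math.log2(col[b] / background[b]) for b in "ACGT"} for col in pfm]

def scan(sequence, pwm, threshold):
    """Return (position, score) for each window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        score = sum(pwm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical toy matrix favouring the -10 box consensus TATAAT.
consensus = "TATAAT"
pfm = [{b: (0.7 if b == c else 0.1) for b in "ACGT"} for c in consensus]

# Assumed uniform background; replace with genome-specific noncoding composition.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

pwm = log_odds_matrix(pfm, background)
sequence = "GGTATAATCC"
hits = scan(sequence, pwm, threshold=8.0)

# Density of promoter-like signals: hits per nucleotide in the region.
density = len(hits) / len(sequence)
```

    Comparing `density` between regulatory and coding regions is the kind of contrast the abstract reports; the threshold choice strongly affects the counts.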

  14. antiSMASH : rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences

    NARCIS (Netherlands)

    Medema, Marnix H.; Blin, Kai; Cimermancic, Peter; de Jager, Victor; Zakrzewski, Piotr; Fischbach, Michael A.; Weber, Tilmann; Takano, Eriko; Breitling, Rainer

    2011-01-01

    Bacterial and fungal secondary metabolism is a rich source of novel bioactive compounds with potential pharmaceutical applications as antibiotics, anti-tumor drugs or cholesterol-lowering drugs. To find new drug candidates, microbiologists are increasingly relying on sequencing genomes of a wide var

  15. Complete Genome Sequence of Gluconacetobacter hansenii Strain NQ5 (ATCC 53582), an Efficient Producer of Bacterial Cellulose.

    Science.gov (United States)

    Pfeffer, Sarah; Mehta, Kalpa; Brown, R Malcolm

    2016-08-11

    This study reports the release of the complete nucleotide sequence of Gluconacetobacter hansenii strain NQ5 (ATCC 53582). This strain was isolated by R. Malcolm Brown, Jr. in a sugar mill in North Queensland, Australia, and is an efficient producer of bacterial cellulose. The elucidation of the genome will contribute to the study of the molecular mechanisms necessary for cellulose biosynthesis.

  16. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals.

    Science.gov (United States)

    Nikolaichik, Yevgeny; Damienikan, Aliaksandr U

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci.

  17. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    Directory of Open Access Journals (Sweden)

    Yevgeny Nikolaichik

    2016-05-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci.

  18. Optimization of Mutation Pressure in Relation to Properties of Protein-Coding Sequences in Bacterial Genomes.

    Directory of Open Access Journals (Sweden)

    Paweł Błażej

    Most mutations are deleterious and require energetically costly repairs. Therefore, it seems that any minimization of mutation rate is beneficial. On the other hand, mutations generate genetic diversity indispensable for evolution and adaptation of organisms to changing environmental conditions. Thus, it is expected that a spontaneous mutational pressure should be an optimal compromise between these two extremes. In order to study the optimization of the pressure, we compared mutational transition probability matrices from bacterial genomes with artificial matrices fulfilling the same general features as the real ones, e.g., the stationary distribution and the speed of convergence to the stationarity. The artificial matrices were optimized on real protein-coding sequences based on Evolutionary Strategies approach to minimize or maximize the probability of non-synonymous substitutions and costs of amino acid replacements depending on their physicochemical properties. The results show that the empirical matrices have a tendency to minimize the effects of mutations rather than maximize their costs on the amino acid level. They were also similar to the optimized artificial matrices in the nucleotide substitution pattern, especially the high transitions/transversions ratio. We observed no substantial differences between the effects of mutational matrices on protein-coding sequences in genomes under study in respect of differently replicated DNA strands, mutational cost types and properties of the referenced artificial matrices. The findings indicate that the empirical mutational matrices are rather adapted to minimize mutational costs in the studied organisms in comparison to other matrices with similar mathematical constraints.

  19. Optimization of Mutation Pressure in Relation to Properties of Protein-Coding Sequences in Bacterial Genomes.

    Science.gov (United States)

    Błażej, Paweł; Miasojedow, Błażej; Grabińska, Małgorzata; Mackiewicz, Paweł

    2015-01-01

    Most mutations are deleterious and require energetically costly repairs. Therefore, it seems that any minimization of mutation rate is beneficial. On the other hand, mutations generate genetic diversity indispensable for evolution and adaptation of organisms to changing environmental conditions. Thus, it is expected that a spontaneous mutational pressure should be an optimal compromise between these two extremes. In order to study the optimization of the pressure, we compared mutational transition probability matrices from bacterial genomes with artificial matrices fulfilling the same general features as the real ones, e.g., the stationary distribution and the speed of convergence to the stationarity. The artificial matrices were optimized on real protein-coding sequences based on Evolutionary Strategies approach to minimize or maximize the probability of non-synonymous substitutions and costs of amino acid replacements depending on their physicochemical properties. The results show that the empirical matrices have a tendency to minimize the effects of mutations rather than maximize their costs on the amino acid level. They were also similar to the optimized artificial matrices in the nucleotide substitution pattern, especially the high transitions/transversions ratio. We observed no substantial differences between the effects of mutational matrices on protein-coding sequences in genomes under study in respect of differently replicated DNA strands, mutational cost types and properties of the referenced artificial matrices. The findings indicate that the empirical mutational matrices are rather adapted to minimize mutational costs in the studied organisms in comparison to other matrices with similar mathematical constraints.

  20. Bacterial community structure of a full-scale biofilter treating pig house exhaust air

    DEFF Research Database (Denmark)

    Kristiansen, Anja; Pedersen, Kristina Hadulla; Nielsen, Per Halkjær;

    2011-01-01

    Biological air filters represent a promising tool for treating emissions of ammonia and odor from pig facilities. Quantitative fluorescence in situ hybridization (FISH) and 16S rRNA gene sequencing were used to investigate the bacterial community structure and diversity in a full-scale biofilter...

  1. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

    2010-01-26

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  2. A sensitive, support-vector-machine method for the detection of horizontal gene transfers in viral, archaeal and bacterial genomes.

    Science.gov (United States)

    Tsirigos, Aristotelis; Rigoutsos, Isidore

    2005-01-01

    In earlier work, we introduced and discussed a generalized computational framework for identifying horizontal transfers. This framework relied on a gene's nucleotide composition, obviated the need for knowledge of codon boundaries and database searches, and was shown to perform very well across a wide range of archaeal and bacterial genomes when compared with previously published approaches, such as Codon Adaptation Index and C + G content. Nonetheless, two considerations remained outstanding: we wanted to further increase the sensitivity of detecting horizontal transfers and also to be able to apply the method to increasingly smaller genomes. In the discussion that follows, we present such a method, Wn-SVM, and show that it exhibits a very significant improvement in sensitivity compared with earlier approaches. Wn-SVM uses a one-class support-vector machine and can learn using rather small training sets. This property makes Wn-SVM particularly suitable for studying small-size genomes, similar to those of viruses, as well as the typically larger archaeal and bacterial genomes. We show experimentally that the new method results in a superior performance across a wide range of organisms and that it improves even upon our own earlier method by an average of 10% across all examined genomes. As a small-genome case study, we analyze the genome of the human cytomegalovirus and demonstrate that Wn-SVM correctly identifies regions that are known to be conserved and prototypical of all beta-herpesvirinae, regions that are known to have been acquired horizontally from the human host and, finally, regions that had not up to now been suspected to be horizontally transferred. Atypical region predictions for many eukaryotic viruses, including the alpha-, beta- and gamma-herpesvirinae, and 123 archaeal and bacterial genomes, have been made available online at http://cbcsrv.watson.ibm.com/HGT_SVM/.

  3. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    Science.gov (United States)

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies.

  4. Amplification of pico-scale DNA mediated by bacterial carrier DNA for small-cell-number transcription factor ChIP-seq

    DEFF Research Database (Denmark)

    Jakobsen, Janus S; Bagger, Frederik O; Hasemann, Marie S;

    2015-01-01

    BACKGROUND: Chromatin-Immunoprecipitation coupled with deep sequencing (ChIP-seq) is used to map transcription factor occupancy and generate epigenetic profiles genome-wide. The requirement of nano-scale ChIP DNA for generation of sequencing libraries has impeded ChIP-seq on in vivo tissues of low cell numbers. RESULTS: We describe a robust, simple and scalable methodology for ChIP-seq of low-abundant cell populations, verified down to 10,000 cells. By employing non-mammalian genome mapping bacterial carrier DNA during amplification, we reliably amplify down to 50 pg of ChIP DNA from... transcription factor (CEBPA) and histone mark (H3K4me3) ChIP. We further demonstrate that genomic profiles are highly resilient to changes in carrier DNA to ChIP DNA ratios. CONCLUSIONS: This represents a significant advance compared to existing technologies, which involve either complex steps of pre...

  5. Predicting relatedness of bacterial genomes using the chaperonin-60 universal target (cpn60 UT): application to Thermoanaerobacter species.

    Science.gov (United States)

    Verbeke, Tobin J; Sparling, Richard; Hill, Janet E; Links, Matthew G; Levin, David; Dumonceaux, Tim J

    2011-05-01

    D.R. Zeigler determined that the sequence identity of bacterial genomes can be predicted accurately using the sequence identities of a corresponding set of genes that meet certain criteria [32]. This three-gene model for comparing bacterial genome pairs requires the determination of the sequence identities for recN, thdF, and rpoA. This involves the generation of approximately 4.2kb of genomic DNA sequence from each organism to be compared, and also normally requires that oligonucleotide primers be designed for amplification and sequencing based on the sequences of closely related organisms. However, we have developed an analogous mathematical model for predicting the sequence identity of whole genomes based on the sequence identity of the 542-567 base pair chaperonin-60 universal target (cpn60 UT). The cpn60 UT is accessible in nearly all bacterial genomes with a single set of universal primers, and its length is such that it can be completely sequenced in one pair of overlapping sequencing reads via di-deoxy sequencing. These mathematical models were applied to a set of Thermoanaerobacter isolates from a wood chip compost pile and it was shown that both the one-gene cpn60 UT-based model and the three-gene model based on recN, rpoA, and thdF predicted that these isolates could be classified as Thermoanaerobacter thermohydrosulfuricus. Furthermore, it was found that the genomic prediction model using cpn60 UT gave similar results to whole-genome sequence alignments over a broad range of taxa, suggesting that this method may have general utility for screening isolates and predicting their taxonomic affiliations.

  6. Characterizing acetogenic metabolism using a genome-scale metabolic reconstruction of Clostridium ljungdahlii

    Energy Technology Data Exchange (ETDEWEB)

    Nagarajan, H; Sahin, M; Nogales, J; Latif, H; Lovley, DR; Ebrahim, A; Zengler, K

    2013-11-25

    Background: The metabolic capabilities of acetogens to ferment a wide range of sugars, to grow autotrophically on H2/CO2, and more importantly on synthesis gas (H2/CO/CO2) make them very attractive candidates as production hosts for biofuels and biocommodities. Acetogenic metabolism is considered one of the earliest modes of bacterial metabolism. A thorough understanding of various factors governing the metabolism, in particular energy conservation mechanisms, is critical for metabolic engineering of acetogens for targeted production of desired chemicals. Results: Here, we present the genome-scale metabolic network of Clostridium ljungdahlii, the first such model for an acetogen. This genome-scale model (iHN637) consisting of 637 genes, 785 reactions, and 698 metabolites captures all the major central metabolic and biosynthetic pathways, in particular pathways involved in carbon fixation and energy conservation. A combination of metabolic modeling, with physiological and transcriptomic data provided insights into autotrophic metabolism as well as aided the characterization of a nitrate reduction pathway in C. ljungdahlii. Analysis of the iHN637 metabolic model revealed that flavin-based electron bifurcation played a key role in energy conservation during autotrophic growth and helped identify genes for some of the critical steps in this mechanism. Conclusions: iHN637 represents a predictive model that recapitulates experimental data, and provides valuable insights into the metabolic response of C. ljungdahlii to genetic perturbations under various growth conditions. Thus, the model will be instrumental in guiding metabolic engineering of C. ljungdahlii for the industrial production of biocommodities and biofuels.

  7. Ultrastructural and molecular characterization of a bacterial symbiosis in the ecologically important scale insect family Coelostomidiidae.

    Science.gov (United States)

    Dhami, Manpreet K; Turner, Adrian P; Deines, Peter; Beggs, Jacqueline R; Taylor, Michael W

    2012-09-01

    Scale insects are important ecologically and as agricultural pests. The majority of scale insect taxa feed exclusively on plant phloem sap, which is carbon rich but deficient in essential amino acids. This suggests that, as seen in the related aphids and psyllids, scale insect nutrition might also depend upon bacterial symbionts, yet very little is known about scale insect-bacteria symbioses. We report here the first identification and molecular characterization of symbiotic bacteria associated with the New Zealand giant scale Coelostomidia wairoensis, using fluorescence in situ hybridization (FISH), transmission electron microscopy (TEM) and 16S rRNA gene-based analysis. Dissection and FISH confirmed the location of the bacteria in large, paired, multilobate organs in the abdominal region of the insect. TEM indicated that the dominant pleomorphic bacteria were confined to bacteriocytes in the sheath-enclosed bacteriome. Phylogenetic analysis revealed the presence of three distinct bacterial types, the bacteriome-associated B-symbiont (Bacteroidetes), an Erwinia-related symbiont (Gammaproteobacteria) and Wolbachia sp. (Alphaproteobacteria). This study extends the current knowledge of scale insect symbionts and is the first microbiological investigation of the ecologically important coelostomidiid scales.

  8. Genome-scale genetic engineering in Escherichia coli.

    Science.gov (United States)

    Jeong, Jaehwan; Cho, Namjin; Jung, Daehee; Bang, Duhee

    2013-11-01

    Genome engineering has been developed to create useful strains for biological studies and industrial uses. However, a continuous challenge remained in the field: technical limitations in high-throughput screening and precise manipulation of strains. Today, technical improvements have made genome engineering more rapid and efficient. This review introduces recent advances in genome engineering technologies applied to Escherichia coli as well as multiplex automated genome engineering (MAGE), a recent technique proposed as a powerful toolkit due to its straightforward process, rapid experimental procedures, and highly efficient properties.

  9. Temporal scaling of bacterial taxa is influenced by both stochastic and deterministic ecological factors.

    Science.gov (United States)

    van der Gast, Christopher J; Ager, Duane; Lilley, Andrew K

    2008-06-01

    Microorganisms operate at a range of spatial and temporal scales acting as key drivers of ecosystem properties. Therefore, many key questions in microbial ecology require the consideration of both spatial and temporal scales. Spatial scaling, in particular the species-area relationship (SAR), has a long history in ecology and has recently been addressed in microbial ecology. However, the temporal analogue of the SAR, the species-time relationship, has received far less attention even in the science of general ecology. Here we focus upon the role of temporal scaling in microbial ecological patterns by coupling molecular characterization of bacterial communities in discrete island (bioreactor) systems with a macroecological approach. Our findings showed that the temporal scaling exponent (slope), and therefore taxa turnover of the bacterial taxa-time relationship decreased as selective pressure (industrial wastewater concentration) increased. Also, as the concentration of industrial wastewater increased across the bioreactors, we observed a gradual switch from stochastic community assembly to more deterministic (niche)-based considerations. The identification of broad-scale statistical patterns is particularly relevant to microbial ecology, as it is frequently difficult to identify individual species or their functions. In this study, we identify wide-reaching statistical patterns of diversity and show that they are shaped by the prevalent underlying ecological factors.
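
    The temporal scaling exponent mentioned above can be estimated as the slope of a log-log regression. The sketch below assumes the usual power-law form of the taxa-time relationship, S = c * T^w, which the abstract does not spell out; the function name and the synthetic data are hypothetical.

```python
import math

def fit_power_law(times, taxa):
    """Least-squares fit of log(S) = log(c) + w * log(T); returns (c, w)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in taxa]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx                      # temporal scaling exponent (slope)
    c = math.exp(mean_y - w * mean_x)  # intercept back-transformed from log space
    return c, w

# Synthetic cumulative taxa counts generated from S = 5 * T^0.3 (not study data).
times = [1, 2, 4, 8, 16]
taxa = [5 * t ** 0.3 for t in times]

c_hat, w_hat = fit_power_law(times, taxa)
```

    A smaller fitted `w_hat` corresponds to lower taxa turnover, which is what the abstract reports as selective pressure increases.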

  10. Dissecting the energy metabolism in Mycoplasma pneumoniae through genome-scale metabolic modeling

    NARCIS (Netherlands)

    Wodke, J.A.; Puchalka, J.; Lluch-Senar, M.; Marcos, J.; Yus, E.; Godinho, M.; Gutierrez-Gallego, R.; Martins Dos Santos, V.A.P.; Serrano, L.; Klipp, E.; Maier, T.

    2013-01-01

    Mycoplasma pneumoniae, a threatening pathogen with a minimal genome, is a model organism for bacterial systems biology for which substantial experimental information is available. With the goal of understanding the complex interactions underlying its metabolism, we analyzed and characterized the met

  11. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey

    Directory of Open Access Journals (Sweden)

    Inoko Hidetoshi

    2006-05-01

    Background: Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. Aims: In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. Results: The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6–28 primary positive clones per probe, of which 50–90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. Conclusion: We have constructed a deep-coverage, high-quality, large-insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.
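
    The coverage figures in this abstract follow from simple arithmetic: clones times mean insert size gives the total cloned DNA, and dividing by the fold coverage gives the haploid genome size that the authors' numbers imply. The sketch below uses only the figures quoted in the abstract; the function names are hypothetical, and the implied genome size is a back-calculation, not a value stated in the record.

```python
def library_bp(n_clones, mean_insert_kb):
    """Total cloned DNA in base pairs."""
    return n_clones * mean_insert_kb * 1_000

def implied_genome_size_bp(total_bp, fold_coverage):
    """Haploid genome size implied by a stated fold coverage."""
    return total_bp / fold_coverage

total = library_bp(313_344, 144)             # about 4.5e10 bp, as in the abstract
genome = implied_genome_size_bp(total, 11)   # roughly 4.1e9 bp (derived, not quoted)
```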

  12. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission.

    Science.gov (United States)

    Giongo, Adriana; Tyler, Heather L; Zipperer, Ursula N; Triplett, Eric W

    2010-06-15

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures.

  13. Large-scale genomic 2D visualization reveals extensive CG-AT skew correlation in bird genomes

    Directory of Open Access Journals (Sweden)

    Deng Xuemei

    2007-11-01

    Background: Bird genomes have very different compositional structure compared with other warm-blooded animals. The variation in the base skew rules in the vertebrate genomes remains puzzling, but it must relate somehow to large-scale genome evolution. Current research is inclined to relate base skew with mutations and their fixation. Here we wish to explore base skew correlations in bird genomes, to develop methods for displaying and quantifying such correlations at different scales, and to discuss possible explanations for the peculiarities of the bird genomes in skew correlation. Results: We have developed a method called Base Skew Double Triangle (BSDT) for exhibiting the genome-scale change of AT/CG skew as a two-dimensional square picture, showing base skews at many scales simultaneously in a single image. By this method we found that most chicken chromosomes have high AT/CG skew correlation (symmetry in the 2D picture), except for some microchromosomes. No other organisms studied (18 species) show such high skew correlations. This visualized high correlation was validated by three kinds of quantitative calculations with overlapping and non-overlapping windows, all indicating that chicken and birds in general have a special genome structure. Similar features were also found in some of the mammal genomes, but clearly much weaker than in chickens. We presume that the skew correlation feature evolved near the time that birds separated from other vertebrate lineages. When we eliminated the repeat sequences from the genomes, the AT and CG skew correlation increased for some mammal genomes, but was still clearly lower than in chickens. Conclusion: Our results suggest that BSDT is an expressive visualization method for AT and CG skew and enabled the discovery of the very high skew correlation in bird genomes; this peculiarity is worth further study. Computational analysis indicated that this correlation might be a compositional characteristic
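
    As a rough illustration of what a windowed skew correlation measures, the sketch below computes AT skew and CG skew in non-overlapping windows and then their Pearson correlation. It assumes the conventional definitions (A - T)/(A + T) and (C - G)/(C + G); the abstract does not give the authors' exact formulas, and this is not the BSDT method itself. The toy sequence is hypothetical.

```python
def skews(window):
    """Return (AT skew, CG skew) for one window; 0.0 when a pair is absent."""
    a, t, c, g = (window.count(b) for b in "ATCG")
    at = (a - t) / (a + t) if a + t else 0.0
    cg = (c - g) / (c + g) if c + g else 0.0
    return at, cg

def windowed_skews(seq, size):
    """Skews over consecutive non-overlapping windows of the given size."""
    seq = seq.upper()
    return [skews(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]

def pearson(xs, ys):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Toy sequence built so that both skews rise and fall together
toy = "AAACTTTGATCGAACCTTGG"
pairs = windowed_skews(toy, 4)
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
```

    Repeating the calculation over a range of window sizes would give one correlation per scale, which is the kind of multi-scale view the abstract describes.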

  14. Genome sequence and plasmid transformation of the model high-yield bacterial cellulose producer Gluconacetobacter hansenii ATCC 53582

    Science.gov (United States)

    Florea, Michael; Reeve, Benjamin; Abbott, James; Freemont, Paul S.; Ellis, Tom

    2016-03-01

    Bacterial cellulose is a strong, highly pure form of cellulose that is used in a range of applications in industry, consumer goods and medicine. Gluconacetobacter hansenii ATCC 53582 is one of the highest reported bacterial cellulose producing strains and has been used as a model organism in numerous studies of bacterial cellulose production and studies aiming to increase cellulose productivity. Here we present a high-quality draft genome sequence for G. hansenii ATCC 53582 and find that in addition to the previously described cellulose synthase operon, ATCC 53582 contains two additional cellulose synthase operons and several previously undescribed genes associated with cellulose production. In parallel, we also develop optimized protocols and identify plasmid backbones suitable for transformation of ATCC 53582, albeit with low efficiencies. Together, these results provide important information for further studies into cellulose synthesis and for future studies aiming to genetically engineer G. hansenii ATCC 53582 for increased cellulose productivity.

  15. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing.

    Science.gov (United States)

    Awad, Mohamed; Ouda, Osama; El-Refy, Ali; El-Feky, Fawzy A; Mosa, Kareem A; Helmy, Mohamed

    2015-01-01

    Sequencing and restriction analysis of genes like 16S rRNA and HSP60 are intensively used for molecular identification in microbial communities. With the aid of rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, genome sequencing technology is still out of reach in developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequence data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of the Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of the 16S rRNA gene. The analysis of the restriction maps of the members of the three groups, using the fragment number information only or along with fragment sizes, successfully identified all of the members of the three groups using a minimum of four and a maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of the FN-Identify method and its two algorithms as an alternative method that uses standard microbiology laboratory techniques when genome sequencing is not available.
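
    The idea of identifying taxa from fragment numbers alone can be illustrated with a small in silico digest. This is a minimal sketch, not the authors' CreateScheme or GeneIdentify algorithms: it assumes linear sequences (so fragments = cut sites + 1), uses three well-known recognition sites, and runs on hypothetical toy sequences rather than real 16S rRNA genes.

```python
# Recognition sites of three common restriction enzymes
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def count_sites(seq, site):
    """Count occurrences of a recognition site, including overlapping ones."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count

def fragment_fingerprint(seq, enzymes=ENZYMES):
    """Fragment count per enzyme for a linear sequence (cut sites + 1)."""
    seq = seq.upper()
    return tuple(count_sites(seq, site) + 1 for site in enzymes.values())

# Hypothetical toy sequences standing in for marker genes of three taxa
toy = {
    "taxon_A": "ACGTGAATTCACGTGGATCCACGT",
    "taxon_B": "ACGTGAATTCACGTGAATTCACGT",
    "taxon_C": "ACGTAAGCTTACGTACGTACGTAC",
}
fingerprints = {name: fragment_fingerprint(s) for name, s in toy.items()}
# The enzyme panel discriminates the taxa if every fingerprint is unique
distinct = len(set(fingerprints.values())) == len(fingerprints)
```

    When two taxa share a fingerprint, the scheme would need another enzyme or the fragment sizes as well, which matches the abstract's note that fragment numbers were used alone or together with sizes.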

  16. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing

    Science.gov (United States)

    Ouda, Osama; El-Refy, Ali; El-Feky, Fawzy A.; Mosa, Kareem A.

    2015-01-01

    Sequencing and restriction analysis of genes like 16S rRNA and HSP60 are intensively used for molecular identification in microbial communities. With the aid of rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, genome sequencing technology is still out of reach in developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequence data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of the Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of the 16S rRNA gene. The analysis of the restriction maps of the members of the three groups, using the fragment number information only or along with fragment sizes, successfully identified all of the members of the three groups using a minimum of four and a maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of the FN-Identify method and its two algorithms as an alternative method that uses standard microbiology laboratory techniques when genome sequencing is not available. PMID:26880910

  17. Incorporating Protein Biosynthesis into the Saccharomyces cerevisiae Genome-scale Metabolic Model

    DEFF Research Database (Denmark)

    Olivares Hernandez, Roberto

    Based on stoichiometric biochemical equations that occur in the cell, the genome-scale metabolic models can quantify the metabolic fluxes, which are regarded as the final representation of the physiological state of the cell. For Saccharomyces cerevisiae the genome-scale model has been......, translation initiation, translation elongation, translation termination, translation elongation, and mRNA decay. Considering this information from the mechanisms of transcription and translation, we will include these stoichiometric reactions into the genome-scale model for S. cerevisiae to obtain the first...

  18. The 19 genomes of Drosophila: a BAC library resource for genus-wide and genome-scale comparative evolutionary research.

    Science.gov (United States)

    Song, Xiang; Goicoechea, Jose Luis; Ammiraju, Jetty S S; Luo, Meizhong; He, Ruifeng; Lin, Jinke; Lee, So-Jeong; Sisneros, Nicholas; Watts, Tom; Kudrna, David A; Golser, Wolfgang; Ashley, Elizabeth; Collura, Kristi; Braidotti, Michele; Yu, Yeisoo; Matzkin, Luciano M; McAllister, Bryant F; Markow, Therese Ann; Wing, Rod A

    2011-04-01

    The genus Drosophila has been the subject of intense comparative phylogenomics characterization to provide insights into genome evolution under diverse biological and ecological contexts and to functionally annotate the Drosophila melanogaster genome, a model system for animal and insect genetics. Recent sequencing of 11 additional Drosophila species from various divergence points of the genus is a first step in this direction. However, to fully reap the benefits of this resource, the Drosophila community is faced with two critical needs: i.e., the expansion of genomic resources from a much broader range of phylogenetic diversity and the development of additional resources to aid in finishing the existing draft genomes. To address these needs, we report the first synthesis of a comprehensive set of bacterial artificial chromosome (BAC) resources for 19 Drosophila species from all three subgenera. Ten libraries were derived from the exact source used to generate 10 of the 12 draft genomes, while the rest were generated from a strategically selected set of species on the basis of salient ecological and life history features and their phylogenetic positions. The majority of the new species have at least one sequenced reference genome for immediate comparative benefit. This 19-BAC library set was rigorously characterized and shown to have large insert sizes (125-168 kb), low nonrecombinant clone content (0.3-5.3%), and deep coverage (9.1-42.9×). Further, we demonstrated the utility of this BAC resource for generating physical maps of targeted loci, refining draft sequence assemblies and identifying potential genomic rearrangements across the phylogeny.

  19. Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity

    Science.gov (United States)

    Bosi, Emanuele; Monk, Jonathan M.; Aziz, Ramy K.; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø.

    2016-01-01

    Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world. PMID:27286824
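
    The pangenome and core genome described here reduce, at their simplest, to a union and an intersection of per-strain gene sets. The sketch below shows only that set logic; the strain and gene names are hypothetical, and a real analysis would first cluster orthologous genes across genomes, a step that is omitted here.

```python
def pan_and_core(gene_sets):
    """Return (pangenome, core genome) from a mapping of strain -> gene set."""
    sets = [set(genes) for genes in gene_sets.values()]
    pan = set().union(*sets)           # genes found in at least one strain
    core = set.intersection(*sets)     # genes found in every strain
    return pan, core

# Hypothetical gene content for three strains (illustration only)
strains = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneC", "geneE"},
}
pan, core = pan_and_core(strains)
accessory = pan - core                 # genes present in only some strains
```

    In an open pangenome, the union keeps growing as strains are added while the intersection shrinks toward a stable core.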

  20. Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity.

    Science.gov (United States)

    Bosi, Emanuele; Monk, Jonathan M; Aziz, Ramy K; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø

    2016-06-28

    Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world.

  1. Rapid genome-scale mapping of chromatin accessibility in tissue

    DEFF Research Database (Denmark)

    Grøntved, Lars; Bandle, Russell; John, Sam;

    2012-01-01

    BACKGROUND: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant...

  2. Genome-scale engineering for systems and synthetic biology

    OpenAIRE

    Esvelt, Kevin Michael; Wang, Harris H.

    2013-01-01

    Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review ...

  3. Large-scale prokaryotic gene prediction and comparison to genome annotation

    DEFF Research Database (Denmark)

    Nielsen, Pernille; Krogh, Anders Stærmose

    2005-01-01

    Motivation: Prokaryotic genomes are sequenced and annotated at an increasing rate. The methods of annotation vary between sequencing groups. It makes genome comparison difficult and may lead to propagation of errors when questionable assignments are adapted from one genome to another. Genome...... comparison either on a large or small scale would be facilitated by using a single standard for annotation, which incorporates a transparency of why an open reading frame (ORF) is considered to be a gene. Results: A total of 143 prokaryotic genomes were scored with an updated version of the prokaryotic...... genefinder EasyGene. Comparison of the GenBank and RefSeq annotations with the EasyGene predictions reveals that in some genomes up to 60% of the genes may have been annotated with a wrong start codon, especially in the GC-rich genomes. The fractional difference between annotated and predicted confirms...

  4. Construction and Characterization of a Bacterial Artificial Chromosome Library for the A-Genome of Cotton (G. arboreum L.)

    Directory of Open Access Journals (Sweden)

    Yan Hu

    2011-01-01

    A bacterial artificial chromosome (BAC) library for the A-genome of cotton has been constructed from the leaves of G. arboreum L. cv. Jianglinzhongmian. It is used as an elite A-genome germplasm resource in the present cotton breeding program and has been used to build a genetic reference map of cotton. The BAC library consists of 123,648 clones stored in 322 384-well plates. Statistical analysis of a set of 103 randomly selected BAC clones indicated that each clone has an average insert length of 100.2 kb per plasmid, with a range of 30 to 190 kb. Theoretically, this represents 7.2 haploid genome equivalents based on an A-genome size of 1697 Mb. The BAC library has been arranged in column pools and superpools allowing screening with various PCR-based markers. In the future, the A-genome cotton BAC library will serve as both a giant gene resource and a valuable tool for map-based gene isolation, physical mapping and comparative genome analysis.

  5. Comparative genomics of non-pseudomonal bacterial species colonising paediatric cystic fibrosis patients

    Directory of Open Access Journals (Sweden)

    Kate L. Ormerod

    2015-09-01

    The genetic disorder cystic fibrosis is a life-limiting condition affecting ∼70,000 people worldwide. Targeted, early treatment of the dominant infecting species, Pseudomonas aeruginosa, has improved patient outcomes; however, there is concern that other species are now stepping in to take its place. In addition, the necessarily long-term antibiotic therapy received by these patients may be providing a suitable environment for the emergence of antibiotic resistance. To investigate these issues, we employed whole-genome sequencing of 28 non-Pseudomonas bacterial strains isolated from three paediatric patients. We did not find any trend of increasing antibiotic resistance (either by mutation or lateral gene transfer) in these isolates in comparison with other examples of the same species. In addition, each isolate contained a virulence gene repertoire that was similar to other examples of the relevant species. These results support the view that impaired clearance in the CF lung does not demand extensive virulence for survival in this habitat. By analysing serial isolates of the same species we uncovered several examples of strain persistence. The same strain of Staphylococcus aureus persisted for nearly a year, despite administration of antibiotics to which it was shown to be sensitive. This is consistent with previous studies showing antibiotic therapy to be inadequate in cystic fibrosis patients, which may also explain the lack of increasing antibiotic resistance over time. Serial isolates of two naturally multi-drug resistant organisms, Achromobacter xylosoxidans and Stenotrophomonas maltophilia, revealed that while all S. maltophilia strains were unique, A. xylosoxidans persisted for nearly five years, making this a species of particular concern. The data generated by this study will assist in developing an understanding of the non-Pseudomonas species associated with cystic fibrosis.

  6. Genome-wide dynamics of a bacterial response to antibiotics that target the cell envelope

    Directory of Open Access Journals (Sweden)

    Tran Ngat

    2011-05-01

    Background: A decline in the discovery of new antibacterial drugs, coupled with a persistent rise in the occurrence of drug-resistant bacteria, has highlighted antibiotics as a diminishing resource. The future development of new drugs with novel antibacterial activities requires a detailed understanding of adaptive responses to existing compounds. This study uses Streptomyces coelicolor A3(2) as a model system to determine the genome-wide transcriptional response following exposure to three antibiotics (vancomycin, moenomycin A and bacitracin) that target distinct stages of cell wall biosynthesis. Results: A generalised response to all three antibiotics was identified which involves activation of transcription of the cell envelope stress sigma factor σE, together with elements of the stringent response, and of the heat, osmotic and oxidative stress regulons. Attenuation of this system by deletion of genes encoding the osmotic stress sigma factor σB or the ppGpp synthetase RelA reduced resistance to both vancomycin and bacitracin. Many antibiotic-specific transcriptional changes were identified, representing cellular processes potentially important for tolerance to each antibiotic. Sensitivity studies using mutants constructed on the basis of the transcriptome profiling confirmed a role for several such genes in antibiotic resistance, validating the usefulness of the approach. Conclusions: Antibiotic inhibition of bacterial cell wall biosynthesis induces both common and compound-specific transcriptional responses. Both can be exploited to increase antibiotic susceptibility. Regulatory networks known to govern responses to environmental and nutritional stresses are also at the core of the common antibiotic response, and likely help cells survive until any specific resistance mechanisms are fully functional.

  7. Comparative Genomics Analysis and Phenotypic Characterization of Shewanella putrefaciens W3-18-1: Anaerobic Respiration, Bacterial Microcompartments, and Lateral Flagella

    Energy Technology Data Exchange (ETDEWEB)

    Qiu, D.; Tu, Q.; He, Zhili; Zhou, Jizhong

    2010-05-17

    Respiratory versatility and psychrophily are the hallmarks of Shewanella. The ability to utilize a wide range of electron acceptors for respiration is due to the large number of c-type cytochrome genes present in the genome of Shewanella strains. More recently the dissimilatory metal reduction of Shewanella species has been extensively and intensively studied for potential applications in the bioremediation of radioactive wastes of groundwater and subsurface environments. Multiple Shewanella genome sequences are now available in the public databases (Fredrickson et al., 2008). Most of the sequenced Shewanella strains were isolated from marine environments and this genus was believed to be of marine origin (Hau and Gralnick, 2007). However, the well-characterized model strain, S. oneidensis MR-1, was isolated from the freshwater lake sediment of Lake Oneida, New York (Myers and Nealson, 1988) and similar bacteria have also been isolated from other freshwater environments (Venkateswaran et al., 1999). Here we comparatively analyzed the genome sequence and physiological characteristics of S. putrefaciens W3-18-1 and S. oneidensis MR-1, isolated from the marine and freshwater lake sediments, respectively. The anaerobic respirations, carbon source utilization, and cell motility have been experimentally investigated. Large scale horizontal gene transfers have been revealed and the genetic divergence between these two strains was considered to be critical to the bacterial adaptation to specific habitats, freshwater or marine sediments.

  8. Phylogenetic Relationships of 3/3 and 2/2 Hemoglobins in Archaeplastida Genomes to Bacterial and Other Eukaryote Hemoglobins

    Institute of Scientific and Technical Information of China (English)

    Serge N. Vinogradov; Iván Fernández; David Hoogewijs; Raúl Arredondo-Peter

    2011-01-01

    Land plants and algae form a supergroup, the Archaeplastida, believed to be monophyletic. We report the results of an analysis of the phylogeny of putative globins in the currently available genomes to bacterial and other eukaryote hemoglobins (Hbs). Archaeplastida genomes have 3/3 and 2/2 Hbs, with the land plant genomes having group 2 2/2 Hbs, except for the unexpected occurrence of two group 1 2/2 Hbs in Ricinus communis. Bayesian analysis shows that plant 3/3 Hbs are related to vertebrate neuroglobins and bacterial flavohemoglobins (FHbs). We sought to define the bacterial groups whose ancestors shared the precursors of Archaeplastida Hbs, via Bayesian and neighbor-joining analyses based on COBALT alignment of representative sets of bacterial 3/3 FHb-like globins and group 1 and 2 2/2 Hbs with the corresponding Archaeplastida Hbs. The results suggest that the Archaeplastida 3/3 and group 1 2/2 Hbs could have originated from the horizontal gene transfers (HGTs) that accompanied the two generally accepted endosymbioses of a proteobacterium and a cyanobacterium with a eukaryote ancestor. In contrast, the origin of the group 2 2/2 Hbs unexpectedly appears to involve HGT from a bacterium ancestral to Chloroflexi, Deinococcales, Bacilli, and Actinomycetes. Furthermore, although intron positions and phases are mostly conserved among the land plant 3/3 and 2/2 globin genes, introns are absent in the algal 3/3 genes and intron positions and phases are highly variable in their 2/2 genes. Thus, introns are irrelevant to globin evolution in Archaeplastida.

  9. Structural genomics of eukaryotic targets at a laboratory scale.

    Science.gov (United States)

    Busso, Didier; Poussin-Courmontagne, Pierre; Rosé, David; Ripp, Raymond; Litt, Alain; Thierry, Jean-Claude; Moras, Dino

    2005-01-01

    Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments centered mainly on miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is involved in a structural genomics program of high eukaryotes whose goal is solving crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization, and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

  10. Efficient PAHs biodegradation by a bacterial consortium at flask and bioreactor scale.

    Science.gov (United States)

    Moscoso, F; Teijiz, I; Deive, F J; Sanromán, M A

    2012-09-01

    In this work, the biodegradation of three polycyclic aromatic hydrocarbons (PAHs) such as Phenanthrene (PHE), Pyrene (PYR) and Benzo[a]anthracene (BaA) has been investigated. A bacterial consortium consisting of two strains was used for the first time based on preliminary promising biodegradation data. They were tentatively identified as Staphylococcus warneri and Bacillus pumilus. Degradation values higher than 85% were obtained for each single PAH when operating at flask scale, whereas minimum levels of 90% of PAHs removal were obtained after just 3 days of cultivation at bioreactor scale. The operation in cometabolic conditions led to maximum levels about 75% and 100% at flask and bioreactor scale, respectively. All the experimental data were analyzed in the light of logistic and Luedeking and Piret type models, with the purpose to better characterize the biodegradation process by S. warneri and B. pumilus. Finally, the metabolic pathway followed to degrade each PAH was ascertained.
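
    For readers unfamiliar with the two model families named above, the sketch below writes out their textbook forms: logistic biomass growth, and the Luedeking-Piret relation dP/dt = alpha * dX/dt + beta * X, integrated in closed form along the logistic curve. How the authors adapted these models to PAH removal is not stated in the abstract, and every parameter value here is hypothetical.

```python
import math

def logistic_biomass(t, x0, xmax, mu):
    """Logistic growth: X(t) = x0*e^(mu*t) / (1 - (x0/xmax)*(1 - e^(mu*t)))."""
    e = math.exp(mu * t)
    return x0 * e / (1.0 - (x0 / xmax) * (1.0 - e))

def luedeking_piret(t, p0, x0, xmax, mu, alpha, beta):
    """Integrated Luedeking-Piret model along a logistic biomass curve.

    alpha weights the growth-associated term, beta the non-growth-associated one.
    """
    x = logistic_biomass(t, x0, xmax, mu)
    growth_term = alpha * (x - x0)
    # Integral of X dt from 0 to t under logistic growth
    integral_x = (xmax / mu) * math.log(1.0 - (x0 / xmax) * (1.0 - math.exp(mu * t)))
    return p0 + growth_term + beta * integral_x

# Hypothetical parameters (illustration only, not fitted to the study's data)
params = dict(x0=0.1, xmax=2.0, mu=0.8)
x_day3 = logistic_biomass(3.0, **params)
p_day3 = luedeking_piret(3.0, p0=0.0, alpha=0.5, beta=0.05, **params)
```

    Fitting such curves to measured time courses yields parameters like the maximum specific growth rate, which is presumably what the abstract means by better characterizing the biodegradation process.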

  11. Large-scale genomic analysis of ovarian carcinomas.

    Science.gov (United States)

    Gorringe, Kylie L; Campbell, Ian G

    2009-04-01

    Epithelial ovarian cancers are typified by frequent genomic aberrations that have been difficult to unravel. Recently, high-resolution array technologies have provided the first glimpse of the remarkable complexity of these aberrations with some ovarian cancers containing hundreds of copy number breakpoints, micro-deletions and amplifications. Many of these alterations contain cancer-related genes suggesting that the majority is disease-associated and not just the product of random genomic instability. Future developments such as next-generation sequencing and integrated analysis of data from multiple array platforms on large numbers of samples are poised to revolutionize our understanding of this complex disease.

  12. Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains.

    Science.gov (United States)

    Whitman, William B; Woyke, Tanja; Klenk, Hans-Peter; Zhou, Yuguang; Lilburn, Timothy G; Beck, Brian J; De Vos, Paul; Vandamme, Peter; Eisen, Jonathan A; Garrity, George; Hugenholtz, Philip; Kyrpides, Nikos C

    2015-01-01

    The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project to sequence about 250 bacterial and archaeal genomes of elevated phylogenetic diversity. Herein, we propose to extend this approach to type strains of prokaryotes associated with soil or plants and their close relatives as well as type strains from newly described species. Understanding the microbiology of soil and plants is critical to many DOE mission areas, such as biofuel production from biomass, biogeochemistry, and carbon cycling. We are also targeting type strains of novel species while they are being described. Since 2006, about 630 new species have been described per year, many of which are closely aligned to DOE areas of interest in soil, agriculture, degradation of pollutants, biofuel production, biogeochemical transformation, and biodiversity.

  13. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Science.gov (United States)

    Tartakovsky, G. D.; Tartakovsky, A. M.; Scheibe, T. D.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

    2013-09-01

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacterium Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport are governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reaction rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model

  14. Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

    Energy Technology Data Exchange (ETDEWEB)

    Tartakovsky, Guzel D.; Tartakovsky, Alexandre M.; Scheibe, Timothy D.; Fang, Yilin; Mahadevan, Radhakrishnan; Lovley, Derek R.

    2013-09-07

    Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacterium Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport are governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reaction rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model under
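
    The Monod-type formulation that this abstract contrasts with the genome-scale model can be illustrated with a small well-mixed batch sketch. It assumes the standard single-substrate form mu = mu_max * S / (K_s + S) with a constant biomass yield; the function names and parameter values are hypothetical, and the pore-scale transport and constraint-based metabolic model of the cited work are not reproduced.

```python
def monod_rate(s, mu_max, k_s):
    """Specific growth rate mu = mu_max * S / (K_s + S), with K_s > 0."""
    s = max(s, 0.0)  # guard against tiny negative values from time stepping
    return mu_max * s / (k_s + s)


def simulate_batch(x0, s0, mu_max, k_s, yield_coeff, t_end, dt=0.001):
    """Explicit Euler integration of a well-mixed batch culture.

    dX/dt = mu(S) * X
    dS/dt = -mu(S) * X / yield_coeff
    The yield coefficient is the lumped parameter that a fitted Monod
    model would adjust; the value used below is illustrative only.
    """
    x, s = x0, s0
    for _ in range(int(round(t_end / dt))):
        growth = monod_rate(s, mu_max, k_s) * x * dt
        x += growth
        s -= growth / yield_coeff
    return x, s


if __name__ == "__main__":
    biomass, substrate = simulate_batch(
        x0=0.01, s0=1.0, mu_max=1.0, k_s=0.5, yield_coeff=0.4, t_end=20.0
    )
    print(biomass, substrate)
```

    Because every unit of growth removes 1 / yield_coeff units of substrate, the quantity X + yield_coeff * S stays constant, so lowering the yield lowers the final biomass for the same substrate supply.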

  15. Purification and partial genome characterization of the bacterial endosymbiont Blattabacterium cuenoti from the fat bodies of cockroaches

    Directory of Open Access Journals (Sweden)

    Yamada Akinori

    2008-11-01

    Full Text Available Abstract Background: Symbiotic relationships between intracellular bacteria and eukaryotes are widespread in nature. Genome sequencing of the bacterial partner has provided a number of key insights into the basis of these symbioses. A challenging aspect of sequencing symbiont genomes is separating the bacteria from the host tissues. In the present study, we describe a simple method of endosymbiont purification from a complex environment, using Blattabacterium cuenoti, which inhabits cockroaches, as a model system. Findings: B. cuenoti cells were successfully purified from the fat bodies of the cockroach Panesthia angustipennis by a combination of slow- and fast-speed centrifugal fractionations, nylon-membrane filtration, and centrifugation with Percoll solutions. We performed pulsed-field electrophoresis, diagnostic PCR and random sequencing of the shotgun library. These experiments confirmed minimal contamination by host and mitochondrial DNA. The genome size and the G+C content of B. cuenoti were inferred to be 650 kb and 32.1 ± 7.6%, respectively. Conclusion: The present study showed successful purification and characterization of the genome of B. cuenoti. Our methodology should be applicable for future symbiont genome sequencing projects. An advantage of the present purification method is that each step is easily performed with ordinary microtubes and a microcentrifuge, and without DNase treatment.

  16. Mapping copy number variation by population-scale genome sequencing

    DEFF Research Database (Denmark)

    Mills, Ryan E.; Walter, Klaudia; Stewart, Chip;

    2011-01-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, ...

  17. An empirical strategy for characterizing bacterial proteomes across species in the absence of genomic sequences

    Energy Technology Data Exchange (ETDEWEB)

    Turse, Joshua E.; Marshall, Matthew J.; Fredrickson, Jim K.; Lipton, Mary S.; Callister, Stephen J.

    2010-11-12

    Current methods in proteomics are dependent on the availability of sequenced genomes to identify proteins. However, genomic sequences are not always available for bacteria or microbial communities, even with high-throughput sequencing technology becoming more readily available. Nevertheless, the homology that exists between related bacteria makes possible the extraction of meaningful biological information from an organism's or community's proteome using the genomic sequence of a near neighbor. Here, a cross-organism search strategy was used to look at the amount of proteomics information obtainable with relative genetic distance from a near neighbor organism and to identify proteins in the proteome of minimally characterized environmental isolates. We conclude that closely related organisms with sequenced genomes can be used to characterize proteomes of organisms with unsequenced genomes. In general, a cross-organism search strategy is a first step toward using sequenced genomes to evaluate the proteomes of environmental bacteria and microbial communities that have no sequenced genome.

  18. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Tier 1 Report

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Borucki, M; Lenhoff, R; Vitalis, E

    2009-09-29

    The Lawrence Livermore National Lab Bioinformatics group has recently taken on a role in DTRA's Transformation Medical Technologies Initiative (TMTI). The high-level goal of TMTI is to accelerate the development of broad-spectrum countermeasures. To achieve that goal, TMTI has a near-term need to obtain more sequence information across a large range of pathogens, near neighbors, and across a broad geographical and host range. Our role in this project is to research available sequence data for the organisms of interest and identify critical microbial sequence and knowledge gaps that need to be filled to meet TMTI objectives. This effort includes: (1) assessing current genomic sequence for each agent including phylogenetic and geographical diversity, host range, date of isolation range, virulence, sequence availability of key near neighbors, and other characteristics; (2) identifying Subject Matter Experts (SMEs) and potential holders of isolate collections, contacting appropriate SMEs with known expertise and isolate collections to obtain information on isolate availability and specific recommendations; (3) identifying sequence as well as knowledge gaps (e.g., virulence, host range, and antibiotic resistance determinants); (4) providing specific recommendations as to the most valuable strains to be placed on the DTRA sequencing queue. We acknowledge that criteria for prioritization of isolates for sequencing fall into two categories aligning with priority queues 1 and 2 as described in the summary. (Priority queue 0 relates to DTRA operational isolates whose availability is not predictable in advance.) 1. Selection of isolates that appear likely to provide information on virulence and antibiotic resistance. This will include sequence of known virulent strains. Particularly valuable would be virulent strains that have genetically similar yet avirulent, or non-human-transmissible, counterparts that can be used for comparison to help

  20. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts

    Science.gov (United States)

    Liu, Zhanjiang; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Yun; Jiang, Chen; Sun, Luyang; Wang, Ruijia; Zhang, Yu; Zhou, Tao; Zeng, Qifan; Fu, Qiang; Gao, Sen; Li, Ning; Koren, Sergey; Jiang, Yanliang; Zimin, Aleksey; Xu, Peng; Phillippy, Adam M.; Geng, Xin; Song, Lin; Sun, Fanyue; Li, Chao; Wang, Xiaozhu; Chen, Ailu; Jin, Yulin; Yuan, Zihao; Yang, Yujia; Tan, Suxu; Peatman, Eric; Lu, Jianguo; Qin, Zhenkui; Dunham, Rex; Li, Zhaoxia; Sonstegard, Tad; Feng, Jianbin; Danzmann, Roy G.; Schroeder, Steven; Scheffler, Brian; Duke, Mary V.; Ballard, Linda; Kucuktas, Huseyin; Kaltenboeck, Ludmilla; Liu, Haixia; Armbruster, Jonathan; Xie, Yangjie; Kirby, Mona L.; Tian, Yi; Flanagan, Mary Elizabeth; Mu, Weijie; Waldbieser, Geoffrey C.

    2016-01-01

    Catfish represent 12% of teleost or 6.3% of all vertebrate species, and are of enormous economic value. Here we report a high-quality reference genome sequence of channel catfish (Ictalurus punctatus), the major aquaculture species in the US. The reference genome sequence was validated by genetic mapping of 54,000 SNPs, and annotated with 26,661 predicted protein-coding genes. Through comparative analysis of genomes and transcriptomes of scaled and scaleless fish and scale regeneration experiments, we address the genomic basis for the most striking physical characteristic of catfish, the evolutionary loss of scales and provide evidence that lack of secretory calcium-binding phosphoproteins accounts for the evolutionary loss of scales in catfish. The channel catfish reference genome sequence, along with two additional genome sequences and transcriptomes of scaled catfishes, provide crucial resources for evolutionary and biological studies. This work also demonstrates the power of comparative subtraction of candidate genes for traits of structural significance. PMID:27249958

  1. Genome-Scale Metabolic Modeling in the Simulation of Field-Scale Uranium Bioremediation

    Science.gov (United States)

    Yabusaki, S.; Wilkins, M.; Fang, Y.; Williams, K. H.; Waichler, S.; Long, P. E.

    2015-12-01

    Coupled variably saturated flow and biogeochemical reactive transport modeling is used to improve understanding of the processes, properties, and conditions controlling uranium bio-immobilization in a field experiment where uranium-contaminated groundwater was amended with acetate and bicarbonate. The acetate stimulates indigenous microorganisms that catalyze metal reduction, including the conversion of aqueous U(VI) to solid-phase U(IV), which effectively removes uranium from solution. The initiation of the bicarbonate amendment prior to biostimulation was designed to promote U(VI) desorption that would increase the aqueous U(VI) available for bioreduction. The three-dimensional simulations were able to largely reproduce the timing and magnitude of the physical, chemical and biological responses to the acetate and bicarbonate amendment in the context of changing water table elevation and gradient. A time series of groundwater proteomic samples exhibited correlations between the most abundant Geobacter metallireducens proteins and the genome-scale metabolic model-predicted fluxes of intra-cellular reactions associated with each of those proteins. The desorption of U(VI) induced by the bicarbonate amendment led to initially higher rates of bioreduction compared to locations with minimal bicarbonate exposure. After bicarbonate amendment ceased, bioreduction continued at these locations whereas U(VI) sorption was the dominant removal mechanism at the bicarbonate-impacted sites.

  2. A short-time scale colloidal system reveals early bacterial adhesion dynamics.

    Directory of Open Access Journals (Sweden)

    Christophe Beloin

    2008-07-01

    Full Text Available The development of bacteria on abiotic surfaces has important public health and sanitary consequences. However, despite several decades of study of bacterial adhesion to inert surfaces, the biophysical mechanisms governing this process remain poorly understood, due, in particular, to the lack of methodologies covering the appropriate time scale. Using micrometric colloidal surface particles and flow cytometry analysis, we developed a rapid multiparametric approach to studying early events in adhesion of the bacterium Escherichia coli. This approach simultaneously describes the kinetics and amplitude of early steps in adhesion, changes in physicochemical surface properties within the first few seconds of adhesion, and the self-association state of attached and free-floating cells. Examination of the role of three well-characterized E. coli surface adhesion factors upon attachment to colloidal surfaces--curli fimbriae, F-conjugative pilus, and Ag43 adhesin--showed clear-cut differences in the very initial phases of surface colonization for cell-bearing surface structures, all known to promote biofilm development. Our multiparametric analysis revealed a correlation in the adhesion phase with cell-to-cell aggregation properties and demonstrated that this phenomenon amplified surface colonization once initial cell-surface attachment was achieved. Monitoring of real-time physico-chemical particle surface properties showed that surface-active molecules of bacterial origin quickly modified surface properties, providing new insight into the intricate relations connecting abiotic surface physicochemical properties and bacterial adhesion. Hence, the biophysical analytical method described here provides a new and relevant approach to quantitatively and kinetically investigating bacterial adhesion and biofilm development.

  3. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald

    2008-01-01

    a function. Results: In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways...... of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated......, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates...

  4. Cycling Transcriptional Networks Optimize Energy Utilization on a Genome Scale.

    Science.gov (United States)

    Wang, Guang-Zhong; Hickey, Stephanie L; Shi, Lei; Huang, Hung-Chung; Nakashe, Prachi; Koike, Nobuya; Tu, Benjamin P; Takahashi, Joseph S; Konopka, Genevieve

    2015-12-01

    Genes expressing circadian RNA rhythms are enriched for metabolic pathways, but the adaptive significance of cyclic gene expression remains unclear. We estimated the genome-wide synthetic and degradative cost of transcription and translation in three organisms and found that the cost of cycling genes is strikingly higher compared to non-cycling genes. Cycling genes are expressed at high levels and constitute the most costly proteins to synthesize in the genome. We demonstrate that metabolic cycling is accelerated in yeast grown under higher nutrient flux and that the number of cycling genes increases by ∼40%; both changes are achieved by increasing the amplitude and not the mean level of gene expression. These results suggest that rhythmic gene expression optimizes the metabolic cost of global gene expression and that highly expressed genes have been selected to be downregulated in a cyclic manner for energy conservation.

  5. Scaling up genome annotation using MAKER and work queue.

    Science.gov (United States)

    Thrasher, Andrew; Musgrave, Zachary; Kachmarck, Brian; Thain, Douglas; Emrich, Scott

    2014-01-01

    Next generation sequencing technologies have enabled sequencing many genomes. Because of the overall increasing demand and the inherent parallelism available in many required analyses, these bioinformatics applications should ideally run on clusters, clouds and/or grids. We present a modified annotation framework that achieves a speed-up of 45x using 50 workers using a Caenorhabditis japonica test case. We also evaluate these modifications within the Amazon EC2 cloud framework. The underlying genome annotation (MAKER) is parallelised as an MPI application. Our framework enables it to now run without MPI while utilising a wide variety of distributed computing resources. This parallel framework also allows easy explicit data transfer, which helps overcome a major limitation of bioinformatics tools that often rely on shared file systems. Combined, our proposed framework can be used, even during early stages of development, to easily run sequence analysis tools on clusters, grids and clouds.
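
    The reported figure of a 45x speed-up on 50 workers can be put in context with a short calculation. The sketch below computes parallel efficiency directly and, as an added assumption not made in the abstract, uses Amdahl's law to back out the serial fraction that would be consistent with that speed-up; the function names are hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / workers


def amdahl_serial_fraction(speedup, workers):
    """Serial fraction f implied by Amdahl's law.

    Amdahl's law: speedup = 1 / (f + (1 - f) / workers).
    Solving for f gives f = (workers / speedup - 1) / (workers - 1).
    Applying Amdahl's law here is an interpretive assumption.
    """
    return (workers / speedup - 1.0) / (workers - 1.0)


if __name__ == "__main__":
    # Figures taken from the abstract: 45x speed-up using 50 workers.
    print(parallel_efficiency(45, 50))
    print(amdahl_serial_fraction(45, 50))
```

    Under these assumptions the run operates at 90% of ideal scaling, which corresponds to a serial fraction of roughly 0.2% of the workload.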

  6. Direct-to-consumer genomics on the scales of autonomy

    Science.gov (United States)

    Vayena, Effy

    2015-01-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the ‘harm’ arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers’ independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions. PMID:24797610

  7. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.;

    2003-01-01

    The metabolic network in the yeast Saccharomyces cerevisiae was reconstructed using currently available genomic, biochemical, and physiological information. The metabolic reactions were compartmentalized between the cytosol and the mitochondria, and transport steps between the compartments...... containing 1175 metabolic reactions and 584 metabolites. The number of gene functions included in the reconstructed network corresponds to ∼16% of all characterized ORFs in S. cerevisiae. Using the reconstructed network, the metabolic capabilities of S. cerevisiae were calculated and compared...

  8. Direct-to-consumer genomics on the scales of autonomy.

    Science.gov (United States)

    Vayena, Effy

    2015-04-01

    Direct-to-consumer (DTC) genetic services have generated enormous controversy from their first emergence. A dramatic recent manifestation of this is the Food and Drug Administration's (FDA) cease and desist order against 23andMe, the leading provider in the market. Critics have argued for the restrictive regulation of such services, and even their prohibition, on the grounds of the harm they pose to consumers. Their advocates, by contrast, defend them as a means of enhancing the autonomy of those same consumers. Autonomy emerges as a key battle-field in this debate, because many of the 'harm' arguments can be interpreted as identifying threats to autonomy. This paper assesses whether DTC genomic services are a threat to, or instead, an enhancement of, personal autonomy. It deploys Joseph Raz's account of personal autonomy, with its emphasis on choice from a range of valuable options. It then seeks to counter claims that DTC genomics threatens autonomy because it involves manipulation in contravention of consumers' independence or because it does not generate valuable options which can be meaningfully engaged with by consumers. It is stressed that the value of the options generated by DTC genomics should not be judged exclusively from the perspective of medical actionability, but should take into consideration plural utilities. Finally, the paper ends by broaching policy recommendations, suggesting that there is a strong autonomy-based argument for permitting DTC genomic services, and that the key question is the nature of the regulatory conditions under which they should be permitted. The discussion of autonomy in this paper helps illuminate some of these conditions.

  9. Symmetry and scale orient Min protein patterns in shaped bacterial sculptures

    Science.gov (United States)

    Wu, Fabai; van Schie, Bas G. C.; Keymer, Juan E.; Dekker, Cees

    2015-08-01

    The boundary of a cell defines the shape and scale of its subcellular organization. However, the effects of the cell's spatial boundaries as well as the geometry sensing and scale adaptation of intracellular molecular networks remain largely unexplored. Here, we show that living bacterial cells can be ‘sculpted’ into defined shapes, such as squares and rectangles, which are used to explore the spatial adaptation of Min proteins that oscillate pole-to-pole in rod-shaped Escherichia coli to assist cell division. In a wide geometric parameter space, ranging from 2 × 1 × 1 to 11 × 6 × 1 μm³, Min proteins exhibit versatile oscillation patterns, sustaining rotational, longitudinal, diagonal, stripe and even transversal modes. These patterns are found to directly capture the symmetry and scale of the cell boundary, and the Min concentration gradients scale with the cell size within a characteristic length range of 3-6 μm. Numerical simulations reveal that local microscopic Turing kinetics of Min proteins can yield global symmetry selection, gradient scaling and an adaptive range, when and only when facilitated by the three-dimensional confinement of the cell boundary. These findings cannot be explained by previous geometry-sensing models based on the longest distance, membrane area or curvature, and reveal that spatial boundaries can facilitate simple molecular interactions to result in far more versatile functions than previously understood.

  10. Dynamics of bacterial communities before and after distribution in a full-scale drinking water network

    KAUST Repository

    El Chakhtoura, Joline

    2015-05-01

    Understanding the biological stability of drinking water distribution systems is imperative in the framework of process control and risk management. The objective of this research was to examine the dynamics of the bacterial community during drinking water distribution at high temporal resolution. Water samples (156 in total) were collected over short time-scales (minutes/hours/days) from the outlet of a treatment plant and a location in its corresponding distribution network. The drinking water is treated by biofiltration and disinfectant residuals are absent during distribution. The community was analyzed by 16S rRNA gene pyrosequencing and flow cytometry as well as conventional, culture-based methods. Despite a random dramatic event (detected with pyrosequencing and flow cytometry but not with plate counts), the bacterial community profile at the two locations did not vary significantly over time. A diverse core microbiome was shared between the two locations (58-65% of the taxa and 86-91% of the sequences) and found to be dependent on the treatment strategy. The bacterial community structure changed during distribution, with greater richness detected in the network and phyla such as Acidobacteria and Gemmatimonadetes becoming abundant. The rare taxa displayed the highest dynamicity, causing the major change during water distribution. This change did not have hygienic implications and is contingent on the sensitivity of the applied methods. The concept of biological stability therefore needs to be revised. Biostability is generally desired in drinking water guidelines but may be difficult to achieve in large-scale complex distribution systems that are inherently dynamic.

  11. TIGER: Toolbox for integrating genome-scale metabolic models, expression data, and transcriptional regulatory networks

    Directory of Open Access Journals (Sweden)

    Jensen Paul A

    2011-09-01

    Full Text Available Abstract Background Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion The TIGER package provides a consistent platform for algorithm development and extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
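
    As a rough illustration of what converting Boolean rules into mixed integer inequalities can look like, the sketch below uses the standard textbook linearization of AND, OR and NOT over binary variables and checks it by brute force. It is not taken from the TIGER package; the function names are hypothetical, and the toolbox's actual formulation (including multilevel rules) may differ.

```python
from itertools import product

# Textbook linear encodings of Boolean rules over binary (0/1) variables.
# Assumption: these are generic MILP constraints, not TIGER's own code.

def and_constraints(x, y, z):
    """True if (x, y, z) satisfies the linear encoding of z = x AND y."""
    return z <= x and z <= y and z >= x + y - 1

def or_constraints(x, y, z):
    """True if (x, y, z) satisfies the linear encoding of z = x OR y."""
    return z >= x and z >= y and z <= x + y

def not_constraints(x, z):
    """True if (x, z) satisfies the linear encoding of z = NOT x."""
    return z == 1 - x

def feasible_outputs(check, n_inputs):
    """Map each binary input tuple to the z values the constraints allow."""
    table = {}
    for inputs in product((0, 1), repeat=n_inputs):
        table[inputs] = [z for z in (0, 1) if check(*inputs, z)]
    return table

and_table = feasible_outputs(and_constraints, 2)
or_table = feasible_outputs(or_constraints, 2)
not_table = feasible_outputs(not_constraints, 1)

for name, table in (("AND", and_table), ("OR", or_table), ("NOT", not_table)):
    print(name, table)
```

    Each input combination leaves exactly one feasible value of z, which is why such inequalities can stand in for a gene-protein-reaction rule inside a single optimization problem.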

  12. Bacterial toxicity comparison between nano- and micro-scaled oxide particles

    Energy Technology Data Exchange (ETDEWEB)

    Jiang Wei; Mashayekhi, Hamid [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States); Xing Baoshan, E-mail: bx@pssci.umass.ed [Department of Plant, Soil and Insect Sciences, University of Massachusetts, Stockbridge Hall, Amherst, MA 01003 (United States)

    2009-05-15

    Toxicity of nano-scaled aluminum, silicon, titanium and zinc oxides to bacteria (Bacillus subtilis, Escherichia coli and Pseudomonas fluorescens) was examined and compared to that of their respective bulk (micro-scaled) counterparts. All nanoparticles but titanium oxide showed higher toxicity (at 20 mg/L) than their bulk counterparts. Toxicity of released metal ions was differentiated from that of the oxide particles. ZnO was the most toxic among the three nanoparticles, causing 100% mortality to the three tested bacteria. Al₂O₃ nanoparticles had a mortality rate of 57% to B. subtilis, 36% to E. coli, and 70% to P. fluorescens. SiO₂ nanoparticles killed 40% of B. subtilis, 58% of E. coli, and 70% of P. fluorescens. TEM images showed attachment of nanoparticles to the bacteria, suggesting that the toxicity was affected by bacterial attachment. Bacterial responses to nanoparticles were different from their bulk counterparts; hence nanoparticle toxicity mechanisms need to be studied thoroughly. - Oxide nanoparticles show higher toxicity than their bulk counterparts

  13. On the limits of computational functional genomics for bacterial lifestyle prediction

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Röttger, Richard; Hauschild, Anne-Christin

    2014-01-01

    We review the level of genomic specificity regarding actinobacterial pathogenicity. As they occupy various niches in diverse habitats, one may assume the existence of lifestyle-specific genomic features. We include 240 actinobacteria classified into four pathogenicity classes: human pathogens (HP...... in the post-genome era and despite next-generation sequencing technology, our ability to efficiently deduce real-world conclusions, such as pathogenicity classification, remains quite limited....

  14. On the road to synthetic life: the minimal cell and genome-scale engineering.

    Science.gov (United States)

    Juhas, Mario

    2016-01-01

    Synthetic biology employs rational engineering principles to build biological systems from the libraries of standard, well characterized biological parts. Biological systems designed and built by synthetic biologists fulfill a plethora of useful purposes, ranging from better healthcare and energy production to biomanufacturing. Recent advancements in the synthesis, assembly and "booting-up" of synthetic genomes and in low and high-throughput genome engineering have paved the way for engineering on the genome-wide scale. One of the key goals of genome engineering is the construction of minimal genomes consisting solely of essential genes (genes indispensable for survival of living organisms). Besides serving as a toolbox to understand the universal principles of life, the cell encoded by minimal genome could be used to build a stringently controlled "cell factory" with a desired phenotype. This review provides an update on recent advances in the genome-scale engineering with particular emphasis on the engineering of minimal genomes. Furthermore, it presents an ongoing discussion to the scientific community for better suitability of minimal or robust cells for industrial applications.

  15. From amplification to gene in thyroid cancer: A high-resolution mapped bacterial-artificial-chromosome resource for cancer chromosome aberrations guides gene discovery after comparative genome hybridization

    Energy Technology Data Exchange (ETDEWEB)

    Chen, X.N.; Gonsky, R.; Korenberg, J.R. [UCLA School of Medicine, Los Angeles, CA (United States). Cedars-Sinai Research Inst.; Knauf, J.A.; Fagin, J.A. [Univ. of Cincinnati, OH (United States). Div. of Endocrinology/Metabolism; Wang, M.; Lai, E.H. [Univ. of North Carolina, Chapel Hill, NC (United States). Dept. of Pharmacology; Chissoe, S. [Washington Univ. School of Medicine, St. Louis, MO (United States). Genome Sequencing

    1998-08-01

    Chromosome rearrangements associated with neoplasms provide a rich resource for definition of the pathways of tumorigenesis. The power of comparative genome hybridization (CGH) to identify novel genes depends on the existence of suitable markers, which are lacking throughout most of the genome. The authors now report a general approach that translates CGH data into higher-resolution genomic-clone data that are then used to define the genes located in aneuploid regions. They used CGH to study 33 thyroid-tumor DNAs and two tumor-cell-line DNAs. The results revealed amplifications of chromosome band 2p21, with less-intense amplification on 2p13, 19q13.1, and 1p36 and with least-intense amplification on 1p34, 1q42, 5q31, 5q33-34, 9q32-34, and 14q32. To define the 2p21 region amplified, a dense array of 373 FISH-mapped chromosome 2 bacterial artificial chromosomes (BACs) was constructed, and 87 of these were hybridized to a tumor-cell line. Four BACs carried genomic DNA that was amplified in these cells. The maximum amplified region was narrowed to 3-6 Mb by multicolor FISH with the flanking BACs, and the minimum amplicon size was defined by a contig of 420 kb. Sequence analysis of the amplified BAC 1D9 revealed a fragment of the gene, encoding protein kinase C epsilon (PKCε), that was then shown to be amplified and rearranged in tumor cells. In summary, CGH combined with a dense mapped resource of BACs and large-scale sequencing has led directly to the definition of PKCε as a previously unmapped candidate gene involved in thyroid tumorigenesis.

  16. The Genomic Sequence of the Oral Pathobiont Strain NI1060 Reveals Unique Strategies for Bacterial Competition and Pathogenicity.

    Directory of Open Access Journals (Sweden)

    Youssef Darzi

    Full Text Available Strain NI1060 is an oral bacterium responsible for periodontitis in a murine ligature-induced disease model. To better understand its pathogenicity, we have determined the complete sequence of its 2,553,982 bp genome. Although closely related to Pasteurella pneumotropica, a pneumonia-associated rodent commensal based on its 16S rRNA, the NI1060 genomic content suggests that they are different species thriving on different energy sources via alternative metabolic pathways. Genomic and phylogenetic analyses showed that strain NI1060 is distinct from the genera currently described in the family Pasteurellaceae, and is likely to represent a novel species. In addition, we found putative virulence genes involved in lipooligosaccharide synthesis, adhesins and bacteriotoxic proteins. These genes are potentially important for host adaption and for the induction of dysbiosis through bacterial competition and pathogenicity. Importantly, strain NI1060 strongly stimulates Nod1, an innate immune receptor, but is defective in two peptidoglycan recycling genes due to a frameshift mutation. The in-depth analysis of its genome thus provides critical insights for the development of NI1060 as a prime model system for infectious disease.

  17. Genome Sequences of 12 Bacterial Isolates Obtained from the Urine of Pregnant Women

    Science.gov (United States)

    Weimer, Cory M.; Deitzler, Grace E.; Robinson, Lloyd S.; Park, SoEun; Hallsworth-Pepin, Kymberlie; Wollam, Aye; Mitreva, Makedonka

    2016-01-01

    The presence of bacteria in urine can pose significant risks during pregnancy. However, there are few reference genome strains for many common urinary bacteria. We isolated 12 urinary strains of Streptococcus, Staphylococcus, Citrobacter, Gardnerella, and Lactobacillus. These strains and their genomes are now available to the research community. PMID:27688327

  18. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    Science.gov (United States)

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrack™, which provides extensive functionalities to man...

  19. Draft Genome Sequence of the Shellfish Bacterial Pathogen Vibrio sp. Strain B183.

    Science.gov (United States)

    Schreier, Harold J; Schott, Eric J

    2014-09-18

    We report the draft genome sequence of Vibrio sp. strain B183, a Gram-negative marine bacterium isolated from shellfish that causes mortality in larval mariculture. The availability of this genome sequence will facilitate the study of its virulence mechanisms and add to our knowledge of Vibrio sp. diversity and evolution.

  20. Whole-Genome Sequencing of Invasion-Resistant Cells Identifies Laminin α2 as a Host Factor for Bacterial Invasion

    Science.gov (United States)

    van Wijk, Xander M.; Döhrmann, Simon; Hallström, Björn M.; Li, Shangzhong; Voldborg, Bjørn G.; Meng, Brandon X.; McKee, Karen K.; van Kuppevelt, Toin H.; Yurchenco, Peter D.; Palsson, Bernhard O.; Lewis, Nathan E.; Nizet, Victor

    2017-01-01

    ABSTRACT To understand the role of glycosaminoglycans in bacterial cellular invasion, xylosyltransferase-deficient mutants of Chinese hamster ovary (CHO) cells were created using clustered regularly interspaced short palindromic repeat (CRISPR) and CRISPR-associated gene 9 (CRISPR-cas9) gene targeting. When these mutants were compared to the pgsA745 cell line, a CHO xylosyltransferase mutant generated previously using chemical mutagenesis, an unexpected result was obtained. Bacterial invasion of pgsA745 cells by group B Streptococcus (GBS), group A Streptococcus, and Staphylococcus aureus was markedly reduced compared to the invasion of wild-type cells, but newly generated CRISPR-cas9 mutants were only resistant to GBS. Invasion of pgsA745 cells was not restored by transfection with xylosyltransferase, suggesting that an additional mutation conferring panresistance to multiple bacteria was present in pgsA745 cells. Whole-genome sequencing and transcriptome sequencing (RNA-Seq) uncovered a deletion in the gene encoding the laminin subunit α2 (Lama2) that eliminated much of domain L4a. Silencing of the long Lama2 isoform in wild-type cells strongly reduced bacterial invasion, whereas transfection with human LAMA2 cDNA significantly enhanced invasion in pgsA745 cells. The addition of exogenous laminin-α2β1γ1/laminin-α2β2γ1 strongly increased bacterial invasion in CHO cells, as well as in human alveolar basal epithelial and human brain microvascular endothelial cells. Thus, the L4a domain in laminin α2 is important for cellular invasion by a number of bacterial pathogens. PMID:28074024

  1. Genomics, evolution, and crystal structure of a new family of bacterial spore kinases

    OpenAIRE

    2009-01-01

    Bacterial spore formation is a complex process of fundamental relevance to biology and human disease. The spore coat structure is complex and poorly understood, and the roles of many of the protein components remain unclear. We describe a new family of spore coat proteins, the bacterial spore kinases (BSKs), and the first crystal structure of a BSK, YtaA (CotI) from Bacillus subtilis. BSKs are widely distributed in spore-forming Bacillus and Clostridium species, and have a dynamic evolutionar...

  2. Genomic DNA fingerprint analysis of biotype 1 Gardnerella vaginalis from patients with and without bacterial vaginosis.

    Science.gov (United States)

    Wu, S R; Hillier, S L; Nath, K

    1996-01-01

    Of the 20 biotype 1 Gardnerella vaginalis isolates analyzed, 10 from patients with bacterial vaginosis and 10 from patients without bacterial vaginosis, none shared the same DNA fingerprint. However, a 1.18-kb HindIII fragment was common among 18 of the 20 biotype 1 isolates in a restriction fragment length polymorphism analysis with a 7.9-kb G. vaginalis DNA probe. PMID:8748302

  3. From Environment to Man: Genome Evolution and Adaptation of Human Opportunistic Bacterial Pathogens

    OpenAIRE

    Estelle Jumas-Bilak; Hélène Marchandin; Brigitte Lamy; Anne Lotthé; Fabien Aujoulat; Frédéric Roger; Alice Bourdier

    2012-01-01

    Environment is recognized as a huge reservoir for bacterial species and a source of human pathogens. Some environmental bacteria have an extraordinary range of activities that include promotion of plant growth or disease, breakdown of pollutants, production of original biomolecules, but also multidrug resistance and human pathogenicity. The versatility of bacterial life-style involves adaptation to various niches. Adaptation to both open environment and human specific niches is a major challe...

  4. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  5. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH.

    Science.gov (United States)

    Bienko, Magda; Crosetto, Nicola; Teytelman, Leonid; Klemm, Sandy; Itzkovitz, Shalev; van Oudenaarden, Alexander

    2013-02-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a database of over 4.3 million primer pairs targeting the human and mouse genomes that is readily usable for rapid and flexible generation of probes.

  6. In Silico Genome-Scale Reconstruction and Validation of the Corynebacterium glutamicum Metabolic Network

    DEFF Research Database (Denmark)

    Kjeldsen, Kjeld Raunkjær; Nielsen, J.

    2009-01-01

    A genome-scale metabolic model of the Gram-positive bacterium Corynebacterium glutamicum ATCC 13032 was constructed comprising 446 reactions and 411 metabolites, based on the annotated genome and available biochemical information. The network was analyzed using constraint based methods. The model...... and lactate. Comparable flux values between in silico model and experimental values were seen, although some differences in the phenotypic behavior between the model and the experimental data were observed,...

  7. In situ spatial patterns of soil bacterial populations, mapped at multiple scales, in an arable soil.

    Science.gov (United States)

    Nunan, N; Wu, K; Young, I M; Crawford, J W; Ritz, K

    2002-11-01

    Very little is known about the spatial organization of soil microbes across scales that are relevant both to microbial function and to field-based processes. The spatial distributions of microbes and microbially mediated activity have a high intrinsic variability. This can present problems when trying to quantify the effects of disturbance, management practices, or climate change on soil microbial systems and attendant function. A spatial sampling regime was implemented in an arable field. Cores of undisturbed soil were sampled from a 3 x 3 x 0.9 m volume of soil (topsoil and subsoil) and a biological thin section, in which the in situ distribution of bacteria could be quantified, prepared from each core. Geostatistical analysis was used to quantify the nature of spatial structure from micrometers to meters and spatial point pattern analysis to test for deviations from complete spatial randomness of mapped bacteria. Spatial structure in the topsoil was only found at the microscale (micrometers), whereas evidence for nested scales of spatial structure was found in the subsoil (at the microscale, and at the centimeter to meter scale). Geostatistical ranges of spatial structure at the micro scale were greater in the topsoil and tended to decrease with depth in the subsoil. Evidence for spatial aggregation in bacteria was stronger in the topsoil and also decreased with depth in the subsoil, though extremely high degrees of aggregation were found at very short distances in the deep subsoil. The data suggest that factors that regulate the distribution of bacteria in the subsoil operate at two scales, in contrast to one scale in the topsoil, and that bacterial patches are larger and more prevalent in the topsoil.
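
    The geostatistical step mentioned above is commonly based on an empirical semivariogram, which measures how dissimilar observations become as the distance between them grows. The sketch below is a minimal one-dimensional version using the classical estimator; the function name, the binning scheme and the toy transect are assumptions for illustration and are not drawn from the study.

```python
from collections import defaultdict

def empirical_semivariogram(positions, values, lag_width=1.0):
    """Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)).

    Pairs are grouped into lag bins of width lag_width (nearest bin).
    Pairs that round to lag 0 are skipped in this simplified sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            distance = abs(positions[i] - positions[j])
            lag_bin = round(distance / lag_width)
            if lag_bin == 0:
                continue
            sums[lag_bin] += (values[i] - values[j]) ** 2
            counts[lag_bin] += 1
    return {b * lag_width: sums[b] / (2 * counts[b]) for b in sorted(sums)}

# Hypothetical transect: bacterial counts at four evenly spaced points.
gamma = empirical_semivariogram([0, 1, 2, 3], [1, 2, 3, 4])
print(gamma)
```

    The lag at which the semivariance levels off is usually read as the range of spatial structure, the quantity the abstract compares between topsoil and subsoil.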

  8. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions

    Science.gov (United States)

    Burton, Joshua N.; Adey, Andrew; Patwardhan, Rupali P.; Qiu, Ruolan; Kitzman, Jacob O.; Shendure, Jay

    2014-01-01

    Genomes assembled de novo from short reads are highly fragmented relative to the finished chromosomes of H. sapiens and key model organisms generated by the Human Genome Project. To address this, we need scalable, cost-effective methods enabling chromosome-scale contiguity. Here we show that genome-wide chromatin interaction datasets, such as those generated by Hi-C, are a rich source of long-range information for assigning, ordering and orienting genomic sequences to chromosomes, including across centromeres. To exploit this, we developed an algorithm that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies. We demonstrate the approach by combining shotgun fragment and short jump mate-pair sequences with Hi-C data to generate chromosome-scale de novo assemblies of the human, mouse and Drosophila genomes, achieving – for human – 98% accuracy in assigning scaffolds to chromosome groups and 99% accuracy in ordering and orienting scaffolds within chromosome groups. Hi-C data can also be used to validate chromosomal translocations in cancer genomes. PMID:24185095

  9. Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome.

    Science.gov (United States)

    Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

    2014-01-01

    Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes.

  10. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitive high performance in the gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of the MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match with the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  11. Unidimensional nonnegative scaling for genome-wide linkage disequilibrium maps.

    Science.gov (United States)

    Liao, Haiyong; Ng, Michael; Fung, Eric; Sham, Pak C

    2008-01-01

    The main aim of this paper is to propose and develop a unidimensional nonnegative scaling model to construct Linkage Disequilibrium (LD) maps. The proposed constrained scaling model can be efficiently solved by transforming it to an unconstrained model. The method is implemented in PC Clusters at Hong Kong Baptist University. The LD maps are constructed for four populations from HapMap data sets with chromosomes of tens of thousands of Single Nucleotide Polymorphisms (SNPs). The similarities and dissimilarities of the LD maps are studied and analysed. Computational results are also reported to show the effectiveness of the method using parallel computation.

  12. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Young, Jamey D.; Xu, Sibei

    2016-01-01

    Metabolic flux analysis (MFA) is considered to be the gold standard for determining the intracellular flux distribution of biological systems. The majority of work using MFA has been limited to core models of metabolism due to challenges in implementing genome-scale MFA and the undesirable trade...... distributions (MIDs),(1) it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications......-off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core...

  13. Mapping condition-dependent regulation of metabolism in yeast through genome-scale modeling

    DEFF Research Database (Denmark)

    Österlund, Tobias; Nookaew, Intawat; Bordel, Sergio

    2013-01-01

    ABSTRACT: BACKGROUND: The genome-scale metabolic model of Saccharomyces cerevisiae, first presented in 2003, was the first genome-scale network reconstruction for a eukaryotic organism. Since then continuous efforts have been made in order to improve and expand the yeast metabolic network. RESULTS......: Here we present iTO977, a comprehensive genome-scale metabolic model that contains more reactions, metabolites and genes than previous models. The model was constructed based on two earlier reconstructions, namely iIN800 and the consensus network, and then improved and expanded using gap......-filling methods and by introducing new reactions and pathways based on studies of the literature and databases. The model was shown to perform well both for growth simulations in different media and gene essentiality analysis for single and double knock-outs. Further, the model was used as a scaffold...

  14. A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis

    Directory of Open Access Journals (Sweden)

    Denoeud France

    2001-03-01

    Full Text Available Abstract Background Some pathogenic bacteria are genetically very homogeneous, making strain discrimination difficult. In the last few years, tandem repeats have been increasingly recognized as markers of choice for genotyping a number of pathogens. The rapid evolution of these structures appears to contribute to the phenotypic flexibility of pathogens. The availability of whole-genome sequences has opened the way to the systematic evaluation of tandem repeats diversity and application to epidemiological studies. Results This report presents a database (http://minisatellites.u-psud.fr) of tandem repeats from publicly available bacterial genomes which facilitates the identification and selection of tandem repeats. We illustrate the use of this database by the characterization of minisatellites from two important human pathogens, Yersinia pestis and Bacillus anthracis. In order to avoid simple sequence contingency loci which may be of limited value as epidemiological markers, and to provide genotyping tools amenable to ordinary agarose gel electrophoresis, only tandem repeats with repeat units at least 9 bp long were evaluated. Yersinia pestis contains 64 such minisatellites in which the unit is repeated at least 7 times. An additional collection of 12 loci with at least 6 units, and a high internal conservation were also evaluated. Forty-nine are polymorphic among five Yersinia strains (twenty-five among three Y. pestis strains). Bacillus anthracis contains 30 comparable structures in which the unit is repeated at least 10 times. Half of these tandem repeats show polymorphism among the strains tested. Conclusions Analysis of the currently available bacterial genome sequences classifies Bacillus anthracis and Yersinia pestis as having an average (approximately 30 per Mb) density of tandem repeat arrays longer than 100 bp when compared to the other bacterial genomes analysed to date. In both cases, testing a fraction of these sequences for
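
    The selection criteria quoted in this abstract (repeat units of at least 9 bp, repeated at least 7 times) can be sketched as a simple scan. The code below is a deliberately naive illustration that only finds perfect copies; the function name, the `max_unit` cap and the toy sequence are assumptions, and the database itself presumably tolerates imperfect repeats and uses a different detection method.

```python
def find_tandem_repeats(seq, min_unit=9, max_unit=30, min_copies=7):
    """Return (start, unit_length, copies) for perfect tandem repeats.

    Simplification: copies must match the first unit exactly, and an
    array may be reported again under a multiple of its unit length.
    """
    hits = []
    n = len(seq)
    for unit in range(min_unit, max_unit + 1):
        i = 0
        # Stop once too little sequence remains for min_copies units.
        while i + unit * min_copies <= n:
            motif = seq[i:i + unit]
            copies = 1
            while seq[i + copies * unit:i + (copies + 1) * unit] == motif:
                copies += 1
            if copies >= min_copies:
                hits.append((i, unit, copies))
                i += copies * unit  # jump past the whole array
            else:
                i += 1
    return hits

# Toy sequence: a 9 bp unit repeated 8 times between short flanks.
toy = "ACGT" * 3 + "ATGCCGTAA" * 8 + "TTGACA"
hits = find_tandem_repeats(toy)
print(hits)
```

    Comparing the `copies` value at the same locus across strains is the basis of the polymorphism counts reported above.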

  15. From genetic circuits to industrial-scale biomanufacturing: bacterial promoters as a cornerstone of biotechnology

    Directory of Open Access Journals (Sweden)

    Pawel Jajesniak

    2015-08-01

    Full Text Available Since the advent of genetic engineering, Escherichia coli, the most widely studied prokaryotic model organism, and other bacterial species have remained at the forefront of biological research. These ubiquitous microorganisms play an essential role in deciphering complex gene regulation mechanisms, large-scale recombinant protein production, and lately the two emerging areas of biotechnology—synthetic biology and metabolic engineering. Among a myriad of factors affecting prokaryotic gene expression, judicious choice of promoter remains one of the most challenging and impactful decisions in many biological experiments. This review provides a comprehensive overview of the current state of bacterial promoter engineering, with an emphasis on its applications in heterologous protein production, synthetic biology and metabolic engineering. In addition to highlighting relevant advances in these fields, the article facilitates the selection of an appropriate promoter by providing pertinent guidelines and explores the development of complementary databases, bioinformatics tools and promoter standardization procedures. The review ends by providing a quick overview of other emerging technologies and future prospects of this vital research area.

  16. Strain Dependent Genetic Networks for Antibiotic-Sensitivity in a Bacterial Pathogen with a Large Pan-Genome.

    Science.gov (United States)

    van Opijnen, Tim; Dedrick, Sandra; Bento, José

    2016-09-01

    The interaction between an antibiotic and bacterium is not merely restricted to the drug and its direct target, rather antibiotic induced stress seems to resonate through the bacterium, creating selective pressures that drive the emergence of adaptive mutations not only in the direct target, but in genes involved in many different fundamental processes as well. Surprisingly, it has been shown that adaptive mutations do not necessarily have the same effect in all species, indicating that the genetic background influences how phenotypes are manifested. However, to what extent the genetic background affects the manner in which a bacterium experiences antibiotic stress, and how this stress is processed is unclear. Here we employ the genome-wide tool Tn-Seq to construct daptomycin-sensitivity profiles for two strains of the bacterial pathogen Streptococcus pneumoniae. Remarkably, over half of the genes that are important for dealing with antibiotic-induced stress in one strain are dispensable in another. By confirming over 100 genotype-phenotype relationships, probing potassium-loss, employing genetic interaction mapping as well as temporal gene-expression experiments we reveal genome-wide conditionally important/essential genes, we discover roles for genes with unknown function, and uncover parts of the antibiotic's mode-of-action. Moreover, by mapping the underlying genomic network for two query genes we encounter little conservation in network connectivity between strains as well as profound differences in regulatory relationships. Our approach uniquely enables genome-wide fitness comparisons across strains, facilitating the discovery that antibiotic responses are complex events that can vary widely between strains, which suggests that in some cases the emergence of resistance could be strain specific and at least for species with a large pan-genome less predictable.

  17. Reliability and applications of statistical methods based on oligonucleotide frequencies in bacterial and archaeal genomes

    DEFF Research Database (Denmark)

    Bohlin, J; Skjerve, E; Ussery, David

    2008-01-01

    BACKGROUND: The increasing number of sequenced prokaryotic genomes contains a wealth of genomic data that needs to be effectively analysed. A set of statistical tools exists for such analysis, but their strengths and weaknesses have not been fully explored. The statistical methods we are concerned...... measure was a good measure to detect horizontally transferred regions, and when used to compare the phylogenetic relationships between plasmids and hosts, significant correlation (R2 = 0.4) was found with genomic GC content and intra-chromosomal homogeneity. CONCLUSION: The statistical methods examined......, or be based on specific statistical distributions. Advantages with these statistical methods include measurements of phylogenetic relationship with relatively small pieces of DNA sampled from almost anywhere within genomes, detection of foreign/conserved DNA, and homology searches. Our aim was to explore...

  18. Genome dynamics of short oligonucleotides: the example of bacterial DNA uptake enhancing sequences.

    Directory of Open Access Journals (Sweden)

    Mohammed Bakkali

    Full Text Available Among the many bacteria naturally competent for transformation by DNA uptake (a phenomenon with significant clinical and financial implications), Pasteurellaceae and Neisseriaceae species preferentially take up DNA containing specific short sequences. The genomic overrepresentation of these DNA uptake enhancing sequences (DUES) causes preferential uptake of conspecific DNA, but the function(s) behind this overrepresentation and its evolution are still a matter for discovery. Here I analyze DUES genome dynamics and evolution and test the validity of the results to other selectively constrained oligonucleotides. I use statistical methods and computer simulations to examine DUESs accumulation in Haemophilus influenzae and Neisseria gonorrhoeae genomes. I analyze DUESs sequence and nucleotide frequencies, as well as those of all their mismatched forms, and prove the dependence of DUESs genomic overrepresentation on their preferential uptake by quantifying and correlating both characteristics. I then argue that mutation, uptake bias, and weak selection against DUESs in less constrained parts of the genome combined are sufficient to cause DUESs accumulation in susceptible parts of the genome with no need for other DUES function. The distribution of overrepresentation values across sequences with different mismatch loads compared to the DUES suggests a gradual yet not linear molecular drive of DNA sequences depending on their similarity to the DUES. Other genomically overrepresented sequences, both pro- and eukaryotic, show similar distribution of frequencies, suggesting that the molecular drive reported above applies to other frequent oligonucleotides. Rare oligonucleotides, however, seem to be gradually drawn to genomic underrepresentation, thus suggesting a molecular drag. To my knowledge this work provides the first clear evidence of the gradual evolution of selectively constrained oligonucleotides, including repeated, palindromic and protein

  19. A simple model for DNA bridging proteins and bacterial or human genomes: bridging-induced attraction and genome compaction

    Science.gov (United States)

    Johnson, J.; Brackley, C. A.; Cook, P. R.; Marenduzzo, D.

    2015-02-01

    We present computer simulations of the phase behaviour of an ensemble of proteins interacting with a polymer, mimicking non-specific binding to a piece of bacterial DNA or eukaryotic chromatin. The proteins can simultaneously bind to the polymer in two or more places to create protein bridges. Despite the lack of any explicit interaction between the proteins or between DNA segments, our simulations confirm previous results showing that when the protein-polymer interaction is sufficiently strong, the proteins come together to form clusters. Furthermore, a sufficiently large concentration of bridging proteins leads to the compaction of the swollen polymer into a globular phase. Here we characterise both the formation of protein clusters and the polymer collapse as a function of protein concentration, protein-polymer affinity and fibre flexibility.

  20. Genomic epidemiology and global diversity of the emerging bacterial pathogen Elizabethkingia anophelis.

    Science.gov (United States)

    Breurec, Sebastien; Criscuolo, Alexis; Diancourt, Laure; Rendueles, Olaya; Vandenbogaert, Mathias; Passet, Virginie; Caro, Valérie; Rocha, Eduardo P C; Touchon, Marie; Brisse, Sylvain

    2016-07-27

    Elizabethkingia anophelis is an emerging pathogen involved in human infections and outbreaks in distinct world regions. We investigated the phylogenetic relationships and pathogenesis-associated genomic features of two neonatal meningitis isolates isolated 5 years apart from one hospital in Central African Republic and compared them with Elizabethkingia from other regions and sources. Average nucleotide identity firmly confirmed that E. anophelis, E. meningoseptica and E. miricola represent demarcated genomic species. A core genome multilocus sequence typing scheme, broadly applicable to Elizabethkingia species, was developed and made publicly available (http://bigsdb.pasteur.fr/elizabethkingia). Phylogenetic analysis revealed distinct E. anophelis sublineages and demonstrated high genetic relatedness between the African isolates, compatible with persistence of the strain in the hospital environment. CRISPR spacer variation between the African isolates was mirrored by the presence of a large mobile genetic element. The pan-genome of E. anophelis comprised 6,880 gene families, underlining genomic heterogeneity of this species. African isolates carried unique resistance genes acquired by horizontal transfer. We demonstrated the presence of extensive variation of the capsular polysaccharide synthesis gene cluster in E. anophelis. Our results demonstrate the dynamic evolution of this emerging pathogen and the power of genomic approaches for Elizabethkingia identification, population biology and epidemiology.

  1. Comparative genomics of the bacterial genus Streptococcus illuminates evolutionary implications of species groups.

    Directory of Open Access Journals (Sweden)

    Xiao-Yang Gao

    Full Text Available Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into "species groups". However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups.

  2. Whole genome amplification and de novo assembly of single bacterial cells.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    Full Text Available BACKGROUND: Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells. METHODOLOGY/PRINCIPAL FINDINGS: We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs. CONCLUSIONS/SIGNIFICANCE: The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.

  3. Long-Term Bacterial Dynamics in a Full-Scale Drinking Water Distribution System

    KAUST Repository

    Prest, E. I.

    2016-10-28

    Large seasonal variations in microbial drinking water quality can occur in distribution networks, but are often not taken into account when evaluating results from short-term water sampling campaigns. Temporal dynamics in bacterial community characteristics were investigated during a two-year drinking water monitoring campaign in a full-scale distribution system operating without detectable disinfectant residual. A total of 368 water samples were collected on a biweekly basis at the water treatment plant (WTP) effluent and at one fixed location in the drinking water distribution network (NET). The samples were analysed for heterotrophic plate counts (HPC), Aeromonas plate counts, adenosine-tri-phosphate (ATP) concentrations, and flow cytometric (FCM) total and intact cell counts (TCC, ICC), water temperature, pH, conductivity, total organic carbon (TOC) and assimilable organic carbon (AOC). Multivariate analysis of the large dataset was performed to explore correlative trends between microbial and environmental parameters. The WTP effluent displayed considerable seasonal variations in TCC (from 90 × 10^3 cells mL^-1 in winter time up to 455 × 10^3 cells mL^-1 in summer time) and in bacterial ATP concentrations (<1–3.6 ng L^-1), which were congruent with water temperature variations. These fluctuations were not detected with HPC and Aeromonas counts. The water in the network was predominantly influenced by the characteristics of the WTP effluent. The increase in ICC between the WTP effluent and the network sampling location was small (34 × 10^3 cells mL^-1 on average) compared to seasonal fluctuations in ICC in the WTP effluent. Interestingly, the extent of bacterial growth in the NET was inversely correlated to AOC concentrations in the WTP effluent (Pearson’s correlation factor r = -0.35), and positively correlated with water temperature (r = 0.49). Collecting a large dataset at high frequency over a two-year period enabled the characterization of previously
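
    The correlation factors quoted in this abstract (r = -0.35 and r = 0.49) are Pearson coefficients. As a minimal sketch of how such a value is obtained from paired measurements, the Python below computes r directly from its definition. The function name and the sample numbers are hypothetical and purely illustrative; they are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired observations (illustrative only, not study data):
# water temperature in degrees C and net growth in cells per mL.
temperature_c = [6.0, 9.5, 14.0, 18.5, 21.0, 16.0]
growth_cells_per_ml = [12e3, 30e3, 28e3, 45e3, 40e3, 38e3]

r = pearson_r(temperature_c, growth_cells_per_ml)
print(f"r = {r:.2f}")
```

    A value near +1 or -1 indicates a strong linear relationship; values such as those reported in the abstract indicate weak to moderate association.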

  4. Long-Term Bacterial Dynamics in a Full-Scale Drinking Water Distribution System.

    Science.gov (United States)

    Prest, E I; Weissbrodt, D G; Hammes, F; van Loosdrecht, M C M; Vrouwenvelder, J S

    2016-01-01

    Large seasonal variations in microbial drinking water quality can occur in distribution networks, but are often not taken into account when evaluating results from short-term water sampling campaigns. Temporal dynamics in bacterial community characteristics were investigated during a two-year drinking water monitoring campaign in a full-scale distribution system operating without detectable disinfectant residual. A total of 368 water samples were collected on a biweekly basis at the water treatment plant (WTP) effluent and at one fixed location in the drinking water distribution network (NET). The samples were analysed for heterotrophic plate counts (HPC), Aeromonas plate counts, adenosine-tri-phosphate (ATP) concentrations, and flow cytometric (FCM) total and intact cell counts (TCC, ICC), water temperature, pH, conductivity, total organic carbon (TOC) and assimilable organic carbon (AOC). Multivariate analysis of the large dataset was performed to explore correlative trends between microbial and environmental parameters. The WTP effluent displayed considerable seasonal variations in TCC (from 90 × 10^3 cells mL^-1 in winter time up to 455 × 10^3 cells mL^-1 in summer time) and in bacterial ATP concentrations (<1–3.6 ng L^-1), which were congruent with water temperature variations. These fluctuations were not detected with HPC and Aeromonas counts. The water in the network was predominantly influenced by the characteristics of the WTP effluent. The increase in ICC between the WTP effluent and the network sampling location was small (34 × 10^3 cells mL^-1 on average) compared to seasonal fluctuations in ICC in the WTP effluent. Interestingly, the extent of bacterial growth in the NET was inversely correlated to AOC concentrations in the WTP effluent (Pearson's correlation factor r = -0.35), and positively correlated with water temperature (r = 0.49). Collecting a large dataset at high frequency over a two-year period enabled the characterization of previously undocumented seasonal dynamics in the distribution

  5. New approach for phylogenetic tree recovery based on genome-scale metabolic networks.

    Science.gov (United States)

    Gamermann, Daniel; Montagud, Arnaud; Conejero, J Alberto; Urchueguía, Javier F; de Córdoba, Pedro Fernández

    2014-07-01

    A wide range of applications and research has been done with genome-scale metabolic models. In this work, we describe an innovative methodology for comparing metabolic networks constructed from genome-scale metabolic models and how to apply this comparison in order to infer evolutionary distances between different organisms. Our methodology allows a quantification of the metabolic differences between different species from a broad range of families and even kingdoms. This quantification is then applied in order to reconstruct phylogenetic trees for sets of various organisms.

  6. Rapid prototyping of microbial cell factories via genome-scale engineering.

    Science.gov (United States)

    Si, Tong; Xiao, Han; Zhao, Huimin

    2015-11-15

    Advances in reading, writing and editing genetic materials have greatly expanded our ability to reprogram biological systems at the resolution of a single nucleotide and on the scale of a whole genome. Such capacity has greatly accelerated the cycles of design, build and test to engineer microbes for efficient synthesis of fuels, chemicals and drugs. In this review, we summarize the emerging technologies that have been applied, or are potentially useful for genome-scale engineering in microbial systems. We will focus on the development of high-throughput methodologies, which may accelerate the prototyping of microbial cell factories.

  7. Multi-scaling hierarchical structure analysis on the sequence of E. coli complete genome

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    We have applied the newly developed hierarchical structure theory for complex systems to analyze the multi-scaling structures of the nucleotide density distribution along a linear DNA sequence from the complete Escherichia coli genome. The hierarchical symmetry in the nucleotide density distribution was demonstrated. In particular, we have shown that the G, C density distribution that represents a strong H-bonding between the two DNA chains is more coherent with smaller similarity parameter compared to that of A, T density distribution, indicating a better organized multi-scaling fluctuation field for G, C density distribution along the genome sequence. The biological significance of these findings is under investigation.

  8. Expression cloning of different bacterial phosphatase-encoding genes by histochemical screening of genomic libraries onto an indicator medium containing phenolphthalein diphosphate and methyl green.

    Science.gov (United States)

    Riccio, M L; Rossolini, G M; Lombardi, G; Chiesurin, A; Satta, G

    1997-02-01

    A system for expression cloning of bacterial phosphatase-encoding genes has been developed, and its potential has been investigated. The system is based on histochemical screening of bacterial genomic libraries, constructed in an Escherichia coli multicopy plasmid vector, for phosphatase-producing clones using an indicator medium (named TPMG) made of Tryptose-Phosphate agar supplemented with the phosphatase substrate phenolphthalein diphosphate and the stain methyl green. To test the performance of this system, three genomic libraries were constructed from bacterial strains of different species which showed different patterns of phosphatase activity, and were screened using the TPMG medium. Following a partial screening, three different phosphatase-encoding genes (respectively encoding a class A non-specific acid phosphatase, an acid-hexose phosphatase and a non-specific alkaline phosphatase) were shotgun-cloned from the above libraries, indicating that the TPMG-based expression cloning system can be useful for rapid isolation of different bacterial phosphatase-encoding genes.

  9. Exploring massive, genome scale datasets with the genometricorr package

    KAUST Repository

    Favorov, Alexander

    2012-05-31

    We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of high-dimensional datasets, rather than providing immediate answers. The software enables several biologically motivated approaches to these data and here we describe the rationale and implementation for each approach. Our models and statistics are implemented in an R package that efficiently calculates the spatial correlation between two sets of genomic intervals (data and/or annotated features), for use as a metric of functional interaction. The software handles any type of pointwise or interval data and instead of running analyses with predefined metrics, it computes the significance and direction of several types of spatial association; this is intended to suggest potentially relevant relationships between the datasets. Availability and implementation: The package, GenometriCorr, can be freely downloaded at http://genometricorr.sourceforge.net/. Installation guidelines and examples are available from the sourceforge repository. The package is pending submission to Bioconductor. © 2012 Favorov et al.

  10. Exploiting Bacterial Whole-Genome Sequencing Data for Evaluation of Diagnostic Assays: Campylobacter Species Identification as a Case Study

    Science.gov (United States)

    Jansen van Rensburg, Melissa J.; Swift, Craig; Cody, Alison J.; Jenkins, Claire

    2016-01-01

    The application of whole-genome sequencing (WGS) to problems in clinical microbiology has had a major impact on the field. Clinical laboratories are now using WGS for pathogen identification, antimicrobial susceptibility testing, and epidemiological typing. WGS data also represent a valuable resource for the development and evaluation of molecular diagnostic assays, which continue to play an important role in clinical microbiology. To demonstrate this application of WGS, this study used publicly available genomic data to evaluate a duplex real-time PCR (RT-PCR) assay that targets mapA and ceuE for the detection of Campylobacter jejuni and Campylobacter coli, leading global causes of bacterial gastroenteritis. In silico analyses of mapA and ceuE primer and probe sequences from 1,713 genetically diverse C. jejuni and C. coli genomes, supported by RT-PCR testing, indicated that the assay was robust, with 1,707 (99.7%) isolates correctly identified. The high specificity of the mapA-ceuE assay was the result of interspecies diversity and intraspecies conservation of the target genes in C. jejuni and C. coli. Rare instances of a lack of specificity among C. coli isolates were due to introgression in mapA or sequence diversity in ceuE. The results of this study illustrate how WGS can be exploited to evaluate molecular diagnostic assays by using publicly available data, online databases, and open-source software. PMID:27733632

  11. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    Science.gov (United States)

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-06-29

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in two-step vitamin C fermentation. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomic analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater ability for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand their cooperative mechanisms with K. vulgare and facilitates the optimization of the bacterial consortium.

  12. Construction of a Bacterial Artificial Chromosome Library of TM-1, a Standard Line for Genetics and Genomics in Upland Cotton

    Institute of Scientific and Technical Information of China (English)

    Yan Hu; Wang-Zhen Guo; Tian-Zhen Zhang

    2009-01-01

    A bacterial artificial chromosome (BAC) library was constructed for Gossypium hirsutum acc. TM-1, a genetic and genomic standard line for Upland cotton. The library consists of 147 456 clones with an average insert size of 122.8 kb, ranging from 97 to 240 kb. About 96.0% of the clones have inserts over 100 kb. Therefore, this library theoretically represents 7.4 haploid genome equivalents based on an AD genome size of 2 425 Mb. Clones were stored in 384 384-well plates and arrayed into multiplex pools for rapid and reliable library screening. BAC screening was carried out by four-round polymerase chain reactions using 23 simple sequence repeat (SSR) markers, three sequence-related amplified polymorphism markers and one pair of primers for a gene associated with fiber development to test the quality of the library. In total, 92 positive BAC clones were identified, with an average of four positive clones per SSR marker, ranging from one to eight hits. Additionally, since these SSR markers have been localized to chromosomes 12 (A12) and 26 (D12) according to the genetic map, these BAC clones are expected to serve as seeds for the physical mapping of these two homologous chromosomes and, subsequently, for map-based cloning of quantitative trait loci or genes associated with important agronomic traits.
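
    The coverage figure in this abstract can be checked with simple arithmetic: total cloned DNA (number of clones times mean insert size) divided by the haploid genome size. The Python sketch below does this with the numbers given in the abstract. The naive product comes out slightly above the reported 7.4 equivalents, at roughly 7.5; the abstract does not say how its figure was derived or rounded, so that gap is left unexplained here. The second function applies the commonly used Clarke-Carbon estimate of the chance that a given sequence is present in the library; this formula is an addition for illustration and is not mentioned in the abstract. Function names are hypothetical.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_mb):
    """Library coverage: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / (genome_mb * 1000.0)

def prob_sequence_present(n_clones, mean_insert_kb, genome_mb):
    """Clarke-Carbon estimate P = 1 - (1 - f)^N, with f = insert / genome."""
    f = mean_insert_kb / (genome_mb * 1000.0)
    return 1.0 - (1.0 - f) ** n_clones

# Values from the abstract: 384 plates of 384 wells, 122.8 kb mean insert,
# 2 425 Mb AD genome.
plates, wells = 384, 384
n_clones = plates * wells  # 147456, matching the stated library size

coverage = genome_equivalents(n_clones, 122.8, 2425)
p = prob_sequence_present(n_clones, 122.8, 2425)
print(n_clones, round(coverage, 2), round(p, 4))
```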

  13. Exploiting Bacterial Whole-Genome Sequencing Data for Evaluation of Diagnostic Assays: Campylobacter Species Identification as a Case Study.

    Science.gov (United States)

    Jansen van Rensburg, Melissa J; Swift, Craig; Cody, Alison J; Jenkins, Claire; Maiden, Martin C J

    2016-12-01

    The application of whole-genome sequencing (WGS) to problems in clinical microbiology has had a major impact on the field. Clinical laboratories are now using WGS for pathogen identification, antimicrobial susceptibility testing, and epidemiological typing. WGS data also represent a valuable resource for the development and evaluation of molecular diagnostic assays, which continue to play an important role in clinical microbiology. To demonstrate this application of WGS, this study used publicly available genomic data to evaluate a duplex real-time PCR (RT-PCR) assay that targets mapA and ceuE for the detection of Campylobacter jejuni and Campylobacter coli, leading global causes of bacterial gastroenteritis. In silico analyses of mapA and ceuE primer and probe sequences from 1,713 genetically diverse C. jejuni and C. coli genomes, supported by RT-PCR testing, indicated that the assay was robust, with 1,707 (99.7%) isolates correctly identified. The high specificity of the mapA-ceuE assay was the result of interspecies diversity and intraspecies conservation of the target genes in C. jejuni and C. coli. Rare instances of a lack of specificity among C. coli isolates were due to introgression in mapA or sequence diversity in ceuE. The results of this study illustrate how WGS can be exploited to evaluate molecular diagnostic assays by using publicly available data, online databases, and open-source software.

  14. Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum

    Directory of Open Access Journals (Sweden)

    Karthik Anantharaman

    2016-01-01

    Full Text Available Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. We propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria.

  15. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production.

    Science.gov (United States)

    Belila, A; El-Chakhtoura, J; Otaibi, N; Muyzer, G; Gonzalez-Gil, G; Saikaly, P E; van Loosdrecht, M C M; Vrouwenvelder, J S

    2016-05-01

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m3/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during

  16. Bacterial community structure and variation in a full-scale seawater desalination plant for drinking water production

    KAUST Repository

    Belila, Abdelaziz

    2016-02-18

    Microbial processes inevitably play a role in membrane-based desalination plants, mainly recognized as membrane biofouling. We assessed the bacterial community structure and diversity during different treatment steps in a full-scale seawater desalination plant producing 40,000 m3/d of drinking water. Water samples were taken over the full treatment train consisting of chlorination, spruce media and cartridge filters, de-chlorination, first and second pass reverse osmosis (RO) membranes and final chlorine dosage for drinking water distribution. The water samples were analyzed for water quality parameters (total bacterial cell number, total organic carbon, conductivity, pH, etc.) and microbial community composition by 16S rRNA gene pyrosequencing. The planktonic microbial community was dominated by Proteobacteria (48.6%) followed by Bacteroidetes (15%), Firmicutes (9.3%) and Cyanobacteria (4.9%). During the pretreatment step, the spruce media filter did not impact the bacterial community composition dominated by Proteobacteria. In contrast, the RO and final chlorination treatment steps reduced the Proteobacterial relative abundance in the produced water where Firmicutes constituted the most dominant bacterial group. Shannon and Chao1 diversity indices showed that bacterial species richness and diversity decreased during the seawater desalination process. The two-stage RO filtration strongly reduced the water conductivity (>99%), TOC concentration (98.5%) and total bacterial cell number (>99%), albeit some bacterial DNA was found in the water after RO filtration. About 0.25% of the total bacterial operational taxonomic units (OTUs) were present in all stages of the desalination plant: the seawater, the RO permeates and the chlorinated drinking water, suggesting that these bacterial strains can survive in different environments such as high/low salt concentration and with/without residual disinfectant. These bacterial strains were not caused by contamination during
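The abstract above reports Shannon and Chao1 indices without showing how they are computed. The following is a minimal sketch, assuming per-sample OTU read counts are available as plain lists of integers; the function names and the example counts are hypothetical and are not taken from the study. The Chao1 estimator is given in its commonly used bias-corrected form.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Hypothetical OTU counts for two sampling points (illustration only).
    samples = {
        "seawater": [120, 80, 40, 10, 5, 2, 2, 1, 1, 1],
        "RO permeate": [50, 3, 1],
    }
    for name, counts in samples.items():
        print(name, round(shannon_index(counts), 3), round(chao1(counts), 2))
```

Comparing the two indices across treatment steps is one way to see the kind of decrease in richness and diversity that the abstract describes; real analyses would normally rarefy samples to equal sequencing depth first.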

  17. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia

    Directory of Open Access Journals (Sweden)

    Stott Matthew B

    2008-07-01

    Full Text Available Abstract Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and the human gastrointestinal tract, only a few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum), is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely

  18. Development and experimental verification of a genome-scale metabolic model for Corynebacterium glutamicum

    Directory of Open Access Journals (Sweden)

    Hirasawa Takashi

    2009-08-01

    Full Text Available Abstract Background: In silico genome-scale metabolic models enable the analysis of the characteristics of metabolic systems of organisms. In this study, we reconstructed a genome-scale metabolic model of Corynebacterium glutamicum on the basis of genome sequence annotation and physiological data. The metabolic characteristics were analyzed using flux balance analysis (FBA), and the results of FBA were validated using data from culture experiments performed at different oxygen uptake rates. Results: The reconstructed genome-scale metabolic model of C. glutamicum contains 502 reactions and 423 metabolites. We collected the reactions and biomass components from the database and literature, and made the model available for flux balance analysis by filling gaps in the reaction networks and removing inadequate loop reactions. Using the framework of FBA and our genome-scale metabolic model, we first simulated the changes in the metabolic flux profiles that occur on changing the oxygen uptake rate. The predicted production yields of carbon dioxide and organic acids agreed well with the experimental data. The metabolic profiles of amino acid production phases were also investigated. A comprehensive gene deletion study was performed in which the effects of gene deletions on metabolic fluxes were simulated; this helped in the identification of several genes whose deletion resulted in an improvement in organic acid production. Conclusion: The genome-scale metabolic model provides useful information for the evaluation of the metabolic capabilities and prediction of the metabolic characteristics of C. glutamicum. This can form a basis for the in silico design of C. glutamicum metabolic networks for improved bioproduction of desirable metabolites.
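For readers unfamiliar with FBA, the standard textbook formulation is sketched below. The symbols are assumptions introduced here, not notation from the abstract: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c the objective weights (typically selecting the biomass reaction), and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

Under this formulation, an experimental condition such as a measured oxygen uptake rate would enter as a bound on the oxygen exchange flux, and a gene deletion would typically be simulated by setting both bounds of the affected reactions to zero. Whether the authors used exactly these conventions is not stated in the abstract.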

  19. Population genomic footprints of fine-scale differentiation between habitats in Mediterranean blue tits.

    Science.gov (United States)

    Szulkin, M; Gagnaire, P-A; Bierne, N; Charmantier, A

    2016-01-01

    Linking population genetic variation to the spatial heterogeneity of the environment is of fundamental interest to evolutionary biology and ecology, in particular when phenotypic differences between populations are observed at biologically small spatial scales. Here, we applied restriction-site associated DNA sequencing (RAD-Seq) to test whether phenotypically differentiated populations of wild blue tits (Cyanistes caeruleus) breeding in a highly heterogeneous environment exhibit genetic structure related to habitat type. Using 12 106 SNPs in 197 individuals from deciduous and evergreen oak woodlands, we applied complementary population genomic analyses, which revealed that genetic variation is influenced by both geographical distance and habitat type. A fine-scale genetic differentiation supported by genome- and transcriptome-wide analyses was found within Corsica, between two adjacent habitats where blue tits exhibit marked differences in breeding time while nesting < 6 km apart. Using redundancy analysis (RDA), we show that genomic variation remains associated with habitat type when controlling for spatial and temporal effects. Finally, our results suggest that the observed patterns of genomic differentiation were not driven by a small proportion of highly differentiated loci, but rather emerged through a process such as habitat choice, which reduces gene flow between habitats across the entire genome. The pattern of genomic isolation-by-environment closely matches differentiation observed at the phenotypic level, thereby offering significant potential for future inference of phenotype-genotype associations in a heterogeneous environment.

  20. Adaptation in Toxic Environments: Arsenic Genomic Islands in the Bacterial Genus Thiomonas.

    Directory of Open Access Journals (Sweden)

    Kelle C Freel

    Full Text Available Acid mine drainage (AMD) is a highly toxic environment for most living organisms due to the presence of many lethal elements including arsenic (As). Thiomonas (Tm.) bacteria are found ubiquitously in AMD and can withstand these extreme conditions, in part because they are able to oxidize arsenite. In order to further improve our knowledge concerning the adaptive capacities of these bacteria, we sequenced and assembled the genomes of six isolates derived from the Carnoulès AMD, and compared them to the genomes of Tm. arsenitoxydans 3As (isolated from the same site) and Tm. intermedia K12 (isolated from a sewage pipe). A detailed analysis of the Tm. sp. CB2 genome revealed that various rearrangements had occurred in comparison to what was observed in 3As and K12, and over 20 genomic islands (GEIs) were found in each of these three genomes. We performed a detailed comparison of the two arsenic-related islands found in CB2, carrying the genes required for arsenite oxidation and As resistance, with those found in K12, 3As, and five other Thiomonas strains also isolated from Carnoulès (CB1, CB3, CB6, ACO3 and ACO7). Our results suggest that these arsenic-related islands have evolved differentially in these closely related Thiomonas strains, leading to divergent capacities to survive in As-rich environments.

  1. Rapid genome-scale mapping of chromatin accessibility in tissue

    Science.gov (United States)

    2012-01-01

    Background: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results: Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massively parallel sequencing, TACh identifies accessible regions that are associated with euchromatic features, and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion: The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied across a broad range of

  2. Rapid genome-scale mapping of chromatin accessibility in tissue

    Directory of Open Access Journals (Sweden)

    Grøntved Lars

    2012-06-01

    Full Text Available Abstract Background: The challenge in extracting genome-wide chromatin features from limiting clinical samples poses a significant hurdle in identification of regulatory marks that impact the physiological or pathological state. Current methods that identify nuclease accessible chromatin are reliant on large amounts of purified nuclei as starting material. This complicates analysis of trace clinical tissue samples that are often stored frozen. We have developed an alternative nuclease based procedure to bypass nuclear preparation to interrogate nuclease accessible regions in frozen tissue samples. Results: Here we introduce a novel technique that specifically identifies Tissue Accessible Chromatin (TACh). The TACh method uses pulverized frozen tissue as starting material and employs one of the two robust endonucleases, Benzonase or Cyanase, which are fully active under a range of stringent conditions such as high levels of detergent and DTT. As a proof of principle we applied TACh to frozen mouse liver tissue. Combined with massively parallel sequencing, TACh identifies accessible regions that are associated with euchromatic features, and accessibility at transcriptional start sites correlates positively with levels of gene transcription. Accessible chromatin identified by TACh overlaps to a large extent with accessible chromatin identified by DNase I using nuclei purified from freshly isolated liver tissue as starting material. The similarities are most pronounced at highly accessible regions, whereas identification of less accessible regions tends to be more divergent between nucleases. Interestingly, we show that some of the differences between DNase I and Benzonase relate to their intrinsic sequence biases and accordingly accessibility of CpG islands is probed more efficiently using TACh. Conclusion: The TACh methodology identifies accessible chromatin derived from frozen tissue samples. We propose that this simple, robust approach can be applied

  3. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models.

    Science.gov (United States)

    King, Zachary A; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A; Ebrahim, Ali; Palsson, Bernhard O; Lewis, Nathan E

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.

  4. Large-scale profiling of microRNAs for The Cancer Genome Atlas.

    Science.gov (United States)

    Chu, Andy; Robertson, Gordon; Brooks, Denise; Mungall, Andrew J; Birol, Inanc; Coope, Robin; Ma, Yussanne; Jones, Steven; Marra, Marco A

    2016-01-01

    The comprehensive multiplatform genomics data generated by The Cancer Genome Atlas (TCGA) Research Network is an enabling resource for cancer research. It includes an unprecedented amount of microRNA sequence data: ~11 000 libraries across 33 cancer types. Combined with initiatives like the National Cancer Institute Genomics Cloud Pilots, such data resources will make intensive analysis of large-scale cancer genomics data widely accessible. To support such initiatives, and to enable comparison of TCGA microRNA data to data from other projects, we describe the process that we developed and used to generate the microRNA sequence data, from library construction through to submission of data to repositories. In the context of this process, we describe the computational pipeline that we used to characterize microRNA expression across large patient cohorts.

  5. Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

    DEFF Research Database (Denmark)

    Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim;

    2008-01-01

    Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other... to a genome scale metabolic model of A. oryzae. Results: Using our assembled EST sequences we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Noteworthy, our annotation strategy resulted...

  6. Homeostatic regulation of supercoiling sensitivity coordinates transcription of the bacterial genome.

    Science.gov (United States)

    Blot, Nicolas; Mavathur, Ramesh; Geertz, Marcel; Travers, Andrew; Muskhelishvili, Georgi

    2006-07-01

    Regulation of cellular growth implies spatiotemporally coordinated programmes of gene transcription. A central question, therefore, is how global transcription is coordinated in the genome. The growth of the unicellular organism Escherichia coli is associated with changes in both the global superhelicity modulated by cellular topoisomerase activity and the relative proportions of the abundant DNA-architectural chromatin proteins. Using a DNA-microarray-based approach that combines mutations in the genes of two important chromatin proteins with induced changes of DNA superhelicity, we demonstrate that genomic transcription is tightly associated with the spatial distribution of supercoiling sensitivity, which in turn depends on chromatin proteins. We further demonstrate that essential metabolic pathways involved in the maintenance of growth respond distinctly to changes of superhelicity. We infer that a homeostatic mechanism organizing the supercoiling sensitivity is coordinating the growth-phase-dependent transcription of the genome.

  7. Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

    Science.gov (United States)

    Bibby, Kyle

    2014-02-01

    The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the importance that viruses play in a broad microbial ecology context, viral ecology remains largely not integrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases; currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.

  8. Probabilistic Clustering of Sequences: Inferring new bacterial regulons by comparative genomics

    CERN Document Server

    van Nimwegen, Erik; Zavolan, Mihaela; Rajewsky, Nikolaus; Siggia, Eric D.

    2002-01-01

    Genome wide comparisons between enteric bacteria yield large sets of conserved putative regulatory sites on a gene by gene basis that need to be clustered into regulons. Using the assumption that regulatory sites can be represented as samples from weight matrices we derive a unique probability distribution for assignments of sites into clusters. Our algorithm, 'PROCSE' (probabilistic clustering of sequences), uses Monte-Carlo sampling of this distribution to partition and align thousands of short DNA sequences into clusters. The algorithm internally determines the number of clusters from the data, and assigns significance to the resulting clusters. We place theoretical limits on the ability of any algorithm to correctly cluster sequences drawn from weight matrices (WMs) when these WMs are unknown. Our analysis suggests that the set of all putative sites for a single genome (e.g. E. coli) is largely inadequate for clustering. When sites from different genomes are combined and all the homologous sites from the ...
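The weight-matrix assumption mentioned in this abstract can be illustrated with a small sketch. This is not the PROCSE algorithm (which samples cluster assignments by Monte Carlo); it only shows how a short DNA site is scored as a sample from a weight matrix relative to a background model. The matrix values, the uniform background and the function name are hypothetical.

```python
import math

# Hypothetical weight matrix for a 3-bp site: one column of base probabilities per position.
WEIGHT_MATRIX = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds(site, weight_matrix, background):
    """Log-likelihood ratio of a site under the weight matrix versus the background."""
    if len(site) != len(weight_matrix):
        raise ValueError("site length must match the number of matrix columns")
    return sum(
        math.log(column[base] / background[base])
        for base, column in zip(site, weight_matrix)
    )


if __name__ == "__main__":
    for site in ("ACG", "ACT", "TTT"):
        print(site, round(log_odds(site, WEIGHT_MATRIX, BACKGROUND), 3))
```

Sites that resemble the matrix consensus receive positive scores and dissimilar sites negative ones; clustering methods built on this assumption must additionally infer the matrices themselves, which is the harder problem the abstract addresses.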

  9. Genome-wide fine-scale recombination rate variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Andrew H Chan

    Full Text Available Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features (including recombination rates, diversity, divergence, GC content, gene content, and sequence quality) is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between

  10. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Directory of Open Access Journals (Sweden)

    Julián Triana

    2014-08-01

    Full Text Available The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing genomic sequencing projects, it is important to conceptualize steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to lead the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942.

  11. Generation and Evaluation of a Genome-Scale Metabolic Network Model of Synechococcus elongatus PCC7942

    Science.gov (United States)

    Triana, Julián; Montagud, Arnau; Siurana, Maria; Fuente, David; Urchueguía, Arantxa; Gamermann, Daniel; Torres, Javier; Tena, Jose; de Córdoba, Pedro Fernández; Urchueguía, Javier F.

    2014-01-01

    The reconstruction of genome-scale metabolic models and their applications represent a great advantage of systems biology. Through their use as metabolic flux simulation models, production of industrially-interesting metabolites can be predicted. Due to the growing number of studies of metabolic models driven by the increasing genomic sequencing projects, it is important to conceptualize steps of reconstruction and analysis. We have focused our work on the cyanobacterium Synechococcus elongatus PCC7942, for which several analyses and insights are unveiled. A comprehensive approach has been used, which can be of interest to lead the process of manual curation and genome-scale metabolic analysis. The final model, iSyf715, includes 851 reactions and 838 metabolites. A biomass equation, which encompasses elementary building blocks to allow cell growth, is also included. The applicability of the model is finally demonstrated by simulating autotrophic growth conditions of Synechococcus elongatus PCC7942. PMID:25141288

  12. Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

    DEFF Research Database (Denmark)

    Cho, Byung-Kwan; Kim, Donghyuk; Knight, Eric M.

    2014-01-01

    Background: At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a sigma-factor to recognize the genomic location at which the process initiates. Although the crucial role of sigma-factors has long been appreciated and characterized for many individual promoters, we do not yet have a genome-scale assessment of their function. Results: Using multiple genome-scale measurements, we elucidated the network of sigma-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 sigma-factor-specific promoters corresponding to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative sigma-factors (the sigma(70) and sigma(38) regulons), confirming the competition model of sigma substitution...

  13. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin;

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  14. A versatile genome-scale PCR-based pipeline for high-definition DNA FISH

    NARCIS (Netherlands)

    Bienko, M.; Crosetto, N.; Teytelman, L.; Klemm, S.; Itzkovitz, S.; van Oudenaarden, A.

    2013-01-01

    We developed a cost-effective genome-scale PCR-based method for high-definition DNA FISH (HD-FISH). We visualized gene loci with diffraction-limited resolution, chromosomes as spot clusters and single genes together with transcripts by combining HD-FISH with single-molecule RNA FISH. We provide a da

  15. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DEFF Research Database (Denmark)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

    2017-01-01

    Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many...

  16. Interplay between Constraints, Objectives, and Optimality for Genome-Scale Stoichiometric Models

    NARCIS (Netherlands)

    Maarleveld, T.R.; Wortel, M.; Olivier, B.G.; Teusink, B.; Bruggeman, F.J.

    2015-01-01

    High-throughput data generation and genome-scale stoichiometric models have greatly facilitated the comprehensive study of metabolic networks. The computation of all feasible metabolic routes with these models, given stoichiometric, thermodynamic, and steady-state constraints, provides important ins

  17. MultiMetEval : Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    NARCIS (Netherlands)

    Zakrzewski, Piotr; Medema, Marnix H.; Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko; Fong, Stephen S.

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the co

  18. BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

    DEFF Research Database (Denmark)

    2016-01-01

    Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized...

  19. Comparative genome-scale metabolic modeling of actinomycetes : The topology of essential core metabolism

    NARCIS (Netherlands)

    Alam, Mohammad Tauqeer; Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Gojobori, Takashi

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  20. Comparative genome-scale metabolic modeling of actinomycetes: the topology of essential core metabolism.

    NARCIS (Netherlands)

    Alam, M.T.; Medema, M.H.; Takano, E.; Breitling, R.

    2011-01-01

    Actinomycetes are highly important bacteria. On one hand, some of them cause severe human and plant diseases, on the other hand, many species are known for their ability to produce antibiotics. Here we report the results of a comparative analysis of genome-scale metabolic models of 37 species of act

  1. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

    Directory of Open Access Journals (Sweden)

    Allen Eric E

    2008-10-01

    large-scale HGT patterns among protein families and genome groups. Although the DarkHorse algorithm cannot, by itself, provide definitive proof of horizontal gene transfer, it is a flexible, powerful tool that can be combined with slower, more rigorous methods in situations where these other methods could not otherwise be applied.

  2. Characterization of bacterial community dynamics in a full-scale drinking water treatment plant.

    Science.gov (United States)

    Li, Cuiping; Ling, Fangqiong; Zhang, Minglu; Liu, Wen-Tso; Li, Yuxian; Liu, Wenjun

    2017-01-01

    Understanding the spatial and temporal dynamics of microbial communities in drinking water systems is vital to securing the microbial safety of drinking water. The objective of this study was to comprehensively characterize the dynamics of microbial biomass and bacterial communities at each step of a full-scale drinking water treatment plant in Beijing, China. Both bulk water and biofilm samples on granular activated carbon (GAC) were collected over 9 months. The proportion of cultivable cells decreased during the treatment processes, and this proportion was higher in the warm season than in the cool season, suggesting that treatment processes and water temperature probably had considerable impact on the R2A cultivability of total bacteria. 16S rRNA gene-based 454 pyrosequencing analysis of the bacterial community revealed that Proteobacteria predominated in all samples. The GAC biofilm harbored a distinct population with a much higher relative abundance of Acidobacteria than water samples. Principal coordinate analysis and one-way analysis of similarity indicated that the dynamics of the microbial communities in bulk water and biofilm samples were better explained by the treatment processes than by sampling time, and distinctive changes of the microbial communities in water occurred after GAC filtration. Furthermore, 20 distinct OTUs contributing most to the dissimilarity among samples of different sampling locations and 6 persistent OTUs present in the entire treatment process flow were identified. Overall, our findings demonstrate the significant effects that treatment processes have on the microbial biomass and community fluctuation and provide implications for further targeted investigation on particular bacteria populations.

  3. Germinal transmission of site-specific excised genomic DNA by the bacterial ParA resolvase

    Science.gov (United States)

    Genome engineering is an essential tool in research and product development. Behind some of the recent advances in plant gene transfer is the development of site-specific recombination systems that enable the precise manipulation of DNA, e.g. the deletion, integration or translocation of DNA. DNA ...

  4. Rapid editing and evolution of bacterial genomes using libraries of synthetic DNA.

    Science.gov (United States)

    Gallagher, Ryan R; Li, Zhe; Lewis, Aaron O; Isaacs, Farren J

    2014-10-01

    Multiplex automated genome engineering (MAGE) is a powerful technology for in vivo genome editing that uses synthetic single-stranded DNA (ssDNA) to introduce targeted modifications directly into the Escherichia coli chromosome. MAGE is a cyclical process that involves transformation of ssDNA (by electroporation) followed by outgrowth, during which bacteriophage homologous recombination proteins mediate annealing of ssDNAs to their genomic targets. By iteratively introducing libraries of mutagenic ssDNAs targeting multiple sites, MAGE can generate combinatorial genetic diversity in a cell population. Alternatively, MAGE can introduce precise mutant alleles at many loci for genome-wide editing or for recoding projects that are not possible with other methods. In recent technological advances, MAGE has been improved by strain modifications and selection techniques that enhance allelic replacement. This protocol describes the manual execution of MAGE wherein each cycle takes ≈ 2.5 h, which, if carried out by two people, allows ≈ 10 continuous cycles of MAGE-based mutagenesis per day.

  5. Comparative genomics of bacterial and plant folate synthesis and salvage: predictions and validations

    Directory of Open Access Journals (Sweden)

    Noiriel Alexandre

    2007-07-01

    Full Text Available Abstract Background Folate synthesis and salvage pathways are relatively well known from classical biochemistry and genetics but they have not been subjected to comparative genomic analysis. The availability of genome sequences from hundreds of diverse bacteria, and from Arabidopsis thaliana, enabled such an analysis using the SEED database and its tools. This study reports the results of the analysis and integrates them with new and existing experimental data. Results Based on sequence similarity and the clustering, fusion, and phylogenetic distribution of genes, several functional predictions emerged from this analysis. For bacteria, these included the existence of novel GTP cyclohydrolase I and folylpolyglutamate synthase gene families, and of a trifunctional p-aminobenzoate synthesis gene. For plants and bacteria, the predictions comprised the identities of a 'missing' folate synthesis gene (folQ) and of a folate transporter, and the absence from plants of a folate salvage enzyme. Genetic and biochemical tests bore out these predictions. Conclusion For bacteria, these results demonstrate that much can be learnt from comparative genomics, even for well-explored primary metabolic pathways. For plants, the findings particularly illustrate the potential for rapid functional assignment of unknown genes that have prokaryotic homologs, by analyzing which genes are associated with the latter. More generally, our data indicate how combined genomic analysis of both plants and prokaryotes can be more powerful than isolated examination of either group alone.

  6. Relative entropy differences in bacterial chromosomes, plasmids, phages and genomic islands

    DEFF Research Database (Denmark)

    Bohlin, Jon; van Passel, Mark W. J.; Snipen, Lars;

    2012-01-01

    Background: We sought to assess whether the concept of relative entropy (information capacity), could aid our understanding of the process of horizontal gene transfer in microbes. We analyzed the differences in information capacity between prokaryotic chromosomes, genomic islands (GI), phages, an...... chromosomes and stably incorporated GIs compared to the transient or independent replicons such as phages and plasmids....

  7. Relative entropy differences in bacterial chromosomes, plasmids, phages and genomic islands

    NARCIS (Netherlands)

    Bohlin, J.; Passel, van M.W.J.

    2012-01-01

    Background: We sought to assess whether the concept of relative entropy (information capacity), could aid our understanding of the process of horizontal gene transfer in microbes. We analyzed the differences in information capacity between prokaryotic chromosomes, genomic islands (GI), phages, and p

  8. Draft genome sequence of Erwinia tracheiphila, an economically important bacterial pathogen of cucurbits

    Science.gov (United States)

    Erwinia tracheiphila is one of the most economically important pathogens of cucumbers, melons, squashes, pumpkins, and gourds in the Northeastern and Midwestern United States, yet its molecular pathology remains uninvestigated. Here we report the first draft genome sequence of an E. tracheiphila str...

  9. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requiring only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
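
    As an illustration of what "SSR mining" means in the record above, the following is a minimal sketch of perfect tandem-repeat detection with a regular expression. It is not GMATA's algorithm; the function name `find_ssrs` and the thresholds (motif length 1 to 6, at least 5 repeats) are assumptions chosen for the example.

```python
import re

def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """
    seq = seq.upper()
    # Lazy motif match so the shortest repeating unit is reported
    # (e.g. "GA" rather than "GAGA"), followed by at least
    # min_repeats - 1 further copies of the same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence: a (GA)6 dimer repeat and an (A)8 monomer run.
example = "CCT" + "GA" * 6 + "CGT" + "A" * 8 + "GC"
hits = find_ssrs(example)
for start, motif, count in hits:
    print(start, motif, count)
```

    Positions are 0-based. A production tool would also need to handle imperfect repeats, ambiguous bases, and sequences too large to hold in memory, none of which this sketch attempts.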

  10. GMATA: an integrated software package for genome-scale SSR mining, marker development and viewing

    Directory of Open Access Journals (Sweden)

    Xuewen Wang

    2016-09-01

    Full Text Available Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed a novel software package, Genome-wide Microsatellite Analyzing Tool Package (GMATA), integrating SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability, and enabling simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allow it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requiring only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than the rich G/C content in DNA sequence. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.

  11. Bacterial Responses and Genome Instability Induced by Subinhibitory Concentrations of Antibiotics

    Directory of Open Access Journals (Sweden)

    Arnaud Gutierrez

    2013-03-01

    Full Text Available The emergence and spread of antibiotic resistance have become a major medical and economic problem. It has also become evident that subinhibitory concentrations of antibiotics, which pollute all kinds of terrestrial and aquatic environments, have a non-negligible effect on the evolution of antibiotic resistance in bacterial populations. Subinhibitory concentrations of antibiotics have a strong effect on mutation rates, horizontal gene transfer and biofilm formation, which may all contribute to the emergence and spread of antibiotic resistance. Therefore, the molecular mechanisms and the evolutionary pressures shaping the bacterial responses to subinhibitory concentrations of antibiotics merit extensive study. Such knowledge is valuable for the development of strategies to increase the efficacy of antibiotic treatments and to extend the lifetime of antibiotics used in therapy by slowing down the emergence of antibiotic resistance.

  12. Genome-scale identification method applied to find cryptic aminoglycoside resistance genes in Pseudomonas aeruginosa.

    Directory of Open Access Journals (Sweden)

    Julie M Struble

    Full Text Available BACKGROUND: The ability of bacteria to rapidly evolve resistance to antibiotics is a critical public health problem. Resistance leads to increased disease severity and death rates, and imposes pressure towards the discovery and development of new antibiotic therapies. Improving understanding of the evolution and genetic basis of resistance is a fundamental goal in the field of microbiology. RESULTS: We have applied a new genomic method, Scalar Analysis of Library Enrichments (SCALEs), to identify genomic regions that, given increased copy number, may lead to aminoglycoside resistance in Pseudomonas aeruginosa at the genome scale. We report the result of selections on highly representative genomic libraries for three different aminoglycoside antibiotics (amikacin, gentamicin, and tobramycin). At the genome scale, we show significant (p<0.05) overlap in genes identified for each aminoglycoside evaluated. Among the genomic segments identified, we confirmed increased resistance associated with an increased copy number of several genomic regions, including the ORF of PA5471, recently implicated in MexXY efflux pump related aminoglycoside resistance; PA4943-PA4946 (encoding a probable GTP-binding protein, a predicted host factor I protein, a delta 2-isopentenylpyrophosphate transferase, and DNA mismatch repair protein mutL); PA0960-PA0963 (encoding hypothetical proteins, a probable cold shock protein, a probable DNA-binding stress protein, and aspartyl-tRNA synthetase); a segment of PA4967 (encoding a topoisomerase IV subunit B); as well as a chimeric clone containing two inserts including the ORFs PA0547 and PA2326 (encoding a probable transcriptional regulator and a probable hypothetical protein, respectively). CONCLUSIONS: The studies reported here demonstrate the application of a new genomic method, SCALEs, which can be used to improve understanding of the evolution of antibiotic resistance in P. aeruginosa. In our demonstration studies, we

  13. Bacteriophage Resistance Mechanisms in the Fish Pathogen Flavobacterium psychrophilum: Linking Genomic Mutations to Changes in Bacterial Virulence Factors

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger

    2015-01-01

    Flavobacterium psychrophilum is an important fish pathogen in salmonid aquaculture worldwide. Due to increased antibiotic resistance, pathogen control using bacteriophages has been explored as a possible alternative treatment. However, the effective use of bacteriophages in pathogen control...... requires overcoming the selection for phage resistance in the bacterial populations. Here, we analyzed resistance mechanisms in F. psychrophilum after phage exposure using whole-genome sequencing of the ancestral phage-sensitive strain 950106-1/1 and six phage-resistant isolates. The phage......-resistant strains had all obtained unique insertions and/or deletions and point mutations distributed among intergenic and genic regions. Mutations in genes related to cell surface properties, gliding motility, and biosynthesis of lipopolysaccharides and cell wall were found. The observed links between phage...

  14. A Year of Infection in the Intensive Care Unit: Prospective Whole Genome Sequencing of Bacterial Clinical Isolates Reveals Cryptic Transmissions and Novel Microbiota.

    Science.gov (United States)

    Roach, David J; Burton, Joshua N; Lee, Choli; Stackhouse, Bethany; Butler-Wu, Susan M; Cookson, Brad T; Shendure, Jay; Salipante, Stephen J

    2015-07-01

    Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital's intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care.
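
    The two pan-genome quantities mentioned in the record above (pan-genome size and genes discovered per isolate sequenced) can be illustrated with simple set arithmetic. This is only a sketch under stated assumptions: the function `pan_genome_curve` and the toy gene sets are hypothetical, and the study may have defined its discovery rate differently (for example, through a fitted curve).

```python
def pan_genome_curve(isolate_genes):
    """Return (pan_genome_size, new_genes_per_isolate).

    isolate_genes: iterable of gene-identifier sets, one per isolate,
    in the order the isolates were sequenced.
    """
    seen = set()
    new_per_isolate = []
    for genes in isolate_genes:
        new = set(genes) - seen          # genes not observed in earlier isolates
        new_per_isolate.append(len(new))
        seen |= new                      # grow the pan-genome
    return len(seen), new_per_isolate

# Toy data: three isolates with partially overlapping gene content.
isolates = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d", "e"}]
size, new_counts = pan_genome_curve(isolates)
print(size, new_counts)
```

    Note that the per-isolate counts depend on sequencing order, so real analyses usually average over many random orderings.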

  15. Biomarker-based classification of bacterial and fungal whole-blood infections in a genome-wide expression study

    Directory of Open Access Journals (Sweden)

    Andreas Dix

    2015-03-01

    Full Text Available Sepsis is a clinical syndrome that can be caused by bacteria or fungi. Early knowledge of the nature of the causative agent is a prerequisite for targeted anti-microbial therapy. Besides currently used detection methods like blood culture and PCR-based assays, the analysis of the transcriptional response of the host to infecting organisms holds great promise. In this study, we aim to examine the transcriptional footprint of infections caused by the bacterial pathogens Staphylococcus aureus and Escherichia coli and the fungal pathogens Candida albicans and Aspergillus fumigatus in a human whole-blood model. Moreover, we use the expression information to build a random forest classifier to classify whether a sample contains a bacterial, fungal, or mock infection. After normalizing the transcription intensities using stably expressed reference genes, we filtered the gene set for biomarkers of bacterial or fungal blood infections. This selection is based on differential expression and an additional gene relevance measure. In this way, we identified 38 biomarker genes, including IL6, SOCS3, and IRG1, which were already associated with sepsis by other studies. Using these genes, we trained the classifier and assessed its performance. It yielded a 96% accuracy (sensitivities >93%, specificities >97%) for a 10-fold stratified cross-validation and a 92% accuracy (sensitivities and specificities >83%) for an additional test dataset comprising Cryptococcus neoformans infections. Furthermore, the classifier is robust to Gaussian noise, indicating correct class predictions on datasets of new species. In conclusion, this genome-wide approach demonstrates an effective feature selection process in combination with the construction of a well-performing classification model. Further analyses of genes with pathogen-dependent expression patterns can provide insights into the systemic host responses, which may lead to new anti-microbial therapeutic advances.

  16. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    Full Text Available With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed not only to integrate these omics data but also to use them to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.

  17. Genome-Scale Analysis of Cell-Specific Regulatory Codes Using Nuclear Enzymes.

    Science.gov (United States)

    Baek, Songjoon; Sung, Myong-Hee

    2016-01-01

    High-throughput sequencing technologies have made it possible for biologists to generate genome-wide profiles of chromatin features at nucleotide resolution. Enzymes such as nucleases or transposases have been instrumental as chromatin-probing agents due to their ability to target accessible chromatin for cleavage or insertion. On the scale of a few hundred base pairs, preferential action of the nuclear enzymes on accessible chromatin allows mapping of cell state-specific accessibility in vivo. Such accessible regions contain functionally important regulatory sites, including promoters and enhancers, which undergo active remodeling for cells adapting in a dynamic environment. DNase-seq and the more recent ATAC-seq are two assays that are gaining popularity. Deep sequencing of DNA libraries from these assays, termed genomic footprinting, has been proposed to enable the comprehensive construction of protein occupancy profiles over the genome at the nucleotide level. Recent studies have discovered limitations of genomic footprinting which reduce the scope of detectable proteins. In addition, the identification of putative factors that bind to the observed footprints remains challenging. Despite these caveats, the methodology still presents significant advantages over alternative techniques such as ChIP-seq or FAIRE-seq. Here we describe computational approaches and tools for analysis of chromatin accessibility and genomic footprinting. Proper experimental design and assay-specific data analysis ensure the detection sensitivity and maximize retrievable information. The enzyme-based chromatin profiling approaches represent a powerful and evolving methodology which facilitates our understanding of how the genome is regulated.

  18. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Full Text Available Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  19. [Bacterial infections as seen from the eukaryotic genome: DNA double strand breaks, inflammation and cancer].

    Science.gov (United States)

    Lemercier, Claudie

    2014-01-01

    An increasing number of studies report that infection by pathogenic bacteria alters the host genome, producing highly hazardous DNA double strand breaks for the eukaryotic cell. Even when DNA repair occurs, it often leaves "scars" on chromosomes that might generate genomic instability at the next cell division. Chronic intestinal inflammation promotes the expansion of genotoxic bacteria in the intestinal microbiota, which in turn triggers tumor formation and colon carcinomas. Bacteria act at the level of the host DNA repair machinery. They also hijack the host cell cycle to allow themselves time for replication in an appropriate reservoir. However, except in the case of bacteria carrying the CDT nuclease, the molecular mechanisms responsible for DNA lesions are not well understood, even if reactive oxygen species released during infection make good candidates.

  20. Complete Genome Sequence of Lactobacillus jensenii Strain SNUV360, a Probiotic for Treatment of Bacterial Vaginosis Isolated from the Vagina of a Healthy Korean Woman

    Science.gov (United States)

    Lee, Sunghee; You, Hyun Ju; Kwon, Bomi

    2017-01-01

    Lactobacillus jensenii SNUV360 is a potential probiotic strain that shows antimicrobial activity for the treatment of bacterial vaginosis. Here, we present the complete genomic sequence of L. jensenii SNUV360, isolated from a vaginal sample from a healthy Korean woman. Analysis of the sequence may provide insight into its functional activity. PMID:28280032

  1. Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture

    Science.gov (United States)

    Previously we have shown that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  2. Genome-enabled selection doubles the accuracy of predicted breeding values for bacterial cold water disease resistance compared to traditional family-based selection in rainbow trout aquaculture

    Science.gov (United States)

    We have shown previously that bacterial cold water disease (BCWD) resistance in rainbow trout can be improved using traditional family-based selection, but progress has been limited to exploiting only between-family genetic variation. Genomic selection (GS) is a new alternative enabling exploitation...

  3. Estimation of long-terminal repeat element content in the Helicoverpa zea genome from next generation sequencing of reduced representation bacterial artificial chromosome (BAC) pools

    Science.gov (United States)

    The lepidopteran pest insect, Helicoverpa zea, feeds on cultivated corn and cotton crops in North America where control remains challenging due to evolution of resistance to chemical and transgenic insecticidal toxins, yet few genomic resources are available for this species. A bacterial artificial...

  4. Construction of a full bacterial artificial chromosome (BAC) library of Oryza sativa genome

    Institute of Scientific and Technical Information of China (English)

    TAOQUANZHOU; HAIYINGZHAO; et al.

    1994-01-01

    We have constructed a full BAC library for the superior early indica variety of Oryza sativa, Guang Lu Ai 4. The MAX Efficiency DH10B, with increased stability of inserts, was used as the BAC host cell. The potent pBelo BACII, with double selection markers, was used as the cloning vector. The cloning efficiency we reached was as high as 98%, and the transformation efficiency was raised to 10^6 transformants/μg of large-fragment DNA. BAC recombinant transformants were picked at random and analyzed for insert size, which averaged 120 kb. We have obtained more than 20,000 such BAC clones. According to the conventional probability equation, they cover the entire rice genome of 420,000 kb. The total insert length of the library gives 5- to 6-fold coverage of the genome. To our knowledge, this is the first reported full BAC library for a complex genome.
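
    The coverage figures quoted in this record can be checked with a short calculation. The sketch below assumes that the "conventional probability equation" is the Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size; the abstract does not name the equation, so this is an assumption, and the function names are hypothetical. The clone count of 20,000 is taken as the lower bound stated in the abstract.

```python
import math

def fold_coverage(n_clones, mean_insert_kb, genome_kb):
    """Total cloned DNA divided by genome size."""
    return n_clones * mean_insert_kb / genome_kb

def prob_locus_represented(n_clones, mean_insert_kb, genome_kb):
    """Clarke-Carbon estimate: P = 1 - (1 - f)^N, with f = insert / genome."""
    f = mean_insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones

def clones_needed(p, mean_insert_kb, genome_kb):
    """Smallest N giving probability >= p that any given locus is in the library."""
    f = mean_insert_kb / genome_kb
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))

# Figures from the abstract: 20,000 clones, 120 kb mean insert, 420,000 kb genome.
coverage = fold_coverage(20_000, 120, 420_000)
p = prob_locus_represented(20_000, 120, 420_000)
n99 = clones_needed(0.99, 120, 420_000)
print(f"coverage = {coverage:.2f}x, P(locus present) = {p:.4f}, N for 99% = {n99}")
```

    Under these assumptions the coverage works out to roughly 5.7-fold, consistent with the 5- to 6-fold figure in the abstract.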

  5. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative

  6. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  7. Bacterial molecular networks: bridging the gap between functional genomics and dynamical modelling.

    Science.gov (United States)

    van Helden, Jacques; Toussaint, Ariane; Thieffry, Denis

    2012-01-01

    This introductory review synthesizes the contents of the volume Bacterial Molecular Networks of the series Methods in Molecular Biology. This volume gathers 9 reviews and 16 method chapters describing computational protocols for the analysis of metabolic pathways, protein interaction networks, and regulatory networks. Each protocol is documented by concrete case studies dedicated to model bacteria or interacting populations. Altogether, the chapters provide a representative overview of state-of-the-art methods for data integration and retrieval, network visualization, graph analysis, and dynamical modelling.

  8. Systematic planning of genome-scale experiments in poorly studied species.

    Science.gov (United States)

    Guan, Yuanfang; Dunham, Maitreya; Caudy, Amy; Troyanskaya, Olga

    2010-03-05

    Genome-scale datasets have been used extensively in model organisms to screen for specific candidates or to predict functions for uncharacterized genes. However, despite the availability of extensive knowledge in model organisms, the planning of genome-scale experiments in poorly studied species is still based on the intuition of experts or heuristic trials. We propose that computational and systematic approaches can be applied to drive the experiment planning process in poorly studied species based on available data and knowledge in closely related model organisms. In this paper, we suggest a computational strategy for recommending genome-scale experiments based on their capability to interrogate diverse biological processes to enable protein function assignment. To this end, we use the data-rich functional genomics compendium of the model organism to quantify the accuracy of each dataset in predicting each specific biological process and the overlap in such coverage between different datasets. Our approach uses an optimized combination of these quantifications to recommend an ordered list of experiments for accurately annotating most proteins in the poorly studied related organisms to most biological processes, as well as a set of experiments that target each specific biological process. The effectiveness of this experiment-planning system is demonstrated for two related yeast species: the model organism Saccharomyces cerevisiae and the comparatively poorly studied Saccharomyces bayanus. Our system recommended a set of S. bayanus experiments based on an S. cerevisiae microarray data compendium. In silico evaluations estimate that less than 10% of the experiments could achieve similar functional coverage to the whole microarray compendium. This estimation was confirmed by performing the recommended experiments in S. bayanus, therefore significantly reducing the labor devoted to characterizing the poorly studied genome. This experiment-planning framework could
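
    The idea of ordering experiments by how much new biological coverage each one adds can be illustrated with a simple greedy ranking. This is only a sketch of a swapped-in technique (greedy marginal coverage), not the optimized combination the authors describe; the function name, the experiment names, and the process sets are all hypothetical.

```python
def rank_experiments(coverage):
    """Order experiments greedily by how many not-yet-covered processes each adds.

    coverage maps a (hypothetical) experiment name to the set of biological
    processes it is assumed to interrogate well.
    """
    remaining = dict(coverage)
    covered = set()
    order = []
    while remaining:
        # Largest marginal gain wins; iterating in sorted order breaks ties by name.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Toy input: illustrative values only, not data from the study.
toy_coverage = {
    "heat_shock": {"protein folding", "stress response"},
    "nitrogen_starvation": {"amino acid metabolism", "autophagy", "stress response"},
    "cell_cycle_arrest": {"DNA replication", "mitosis"},
    "osmotic_stress": {"stress response"},
}
ranking = rank_experiments(toy_coverage)
print(ranking)
```

    A real planner would weight each process by prediction accuracy rather than counting it as simply covered or not; the greedy loop only conveys why a small, well-chosen subset can approach the coverage of a full compendium.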

  9. Disinfection of bacterial biofilms in pilot-scale cooling tower systems.

    Science.gov (United States)

    Liu, Yang; Zhang, Wei; Sileika, Tadas; Warta, Richard; Cianciotto, Nicholas P; Packman, Aaron I

    2011-04-01

    The impact of continuous chlorination and periodic glutaraldehyde treatment on planktonic and biofilm microbial communities was evaluated in pilot-scale cooling towers operated continuously for 3 months. The system was operated at a flow rate of 10,080 L/day. Experiments were performed with a well-defined microbial consortium containing three heterotrophic bacteria: Pseudomonas aeruginosa, Klebsiella pneumoniae and Flavobacterium sp. The persistence of each species was monitored in the recirculating cooling water loop and in biofilms on steel and PVC coupons in the cooling tower basin. The observed bacterial colonization in cooling towers did not follow trends in growth rates observed under batch conditions and, instead, reflected differences in the ability of each organism to remain attached and form biofilms under the high-through flow conditions in cooling towers. Flavobacterium was the dominant organism in the community, while P. aeruginosa and K. pneumoniae did not attach well to either PVC or steel coupons in cooling towers and were not able to persist in biofilms. As a result, the much greater ability of Flavobacterium to adhere to surfaces protected it from disinfection, whereas P. aeruginosa and K. pneumoniae were subject to rapid disinfection in the planktonic state.

  10. Weak synchronization and large-scale collective oscillation in dense bacterial suspensions

    Science.gov (United States)

    Chen, Chong; Liu, Song; Shi, Xia-Qing; Chaté, Hugues; Wu, Yilin

    2017-01-01

    Collective oscillatory behaviour is ubiquitous in nature, having a vital role in many biological processes from embryogenesis and organ development to pace-making in neuron networks. Elucidating the mechanisms that give rise to synchronization is essential to the understanding of biological self-organization. Collective oscillations in biological multicellular systems often arise from long-range coupling mediated by diffusive chemicals, by electrochemical mechanisms, or by biomechanical interaction between cells and their physical environment. In these examples, the phase of some oscillatory intracellular degree of freedom is synchronized. Here, in contrast, we report the discovery of a weak synchronization mechanism that does not require long-range coupling or inherent oscillation of individual cells. We find that millions of motile cells in dense bacterial suspensions can self-organize into highly robust collective oscillatory motion, while individual cells move in an erratic manner, without obvious periodic motion but with frequent, abrupt and random directional changes. So erratic are individual trajectories that uncovering the collective oscillations of our micrometre-sized cells requires individual velocities to be averaged over tens or hundreds of micrometres. On such large scales, the oscillations appear to be in phase and the mean position of cells typically describes a regular elliptic trajectory. We found that the phase of the oscillations is organized into a centimetre-scale travelling wave. We present a model of noisy self-propelled particles with strictly local interactions that accounts faithfully for our observations, suggesting that self-organized collective oscillatory motion results from spontaneous chiral and rotational symmetry breaking. These findings reveal a previously unseen type of long-range order in active matter systems (those in which energy is spent locally to produce non-random motion). This mechanism of collective oscillation may

  11. Genomics, evolution, and crystal structure of a new family of bacterial spore kinases.

    Science.gov (United States)

    Scheeff, Eric D; Axelrod, Herbert L; Miller, Mitchell D; Chiu, Hsiu-Ju; Deacon, Ashley M; Wilson, Ian A; Manning, Gerard

    2010-05-01

    Bacterial spore formation is a complex process of fundamental relevance to biology and human disease. The spore coat structure is complex and poorly understood, and the roles of many of the protein components remain unclear. We describe a new family of spore coat proteins, the bacterial spore kinases (BSKs), and the first crystal structure of a BSK, YtaA (CotI) from Bacillus subtilis. BSKs are widely distributed in spore-forming Bacillus and Clostridium species, and have a dynamic evolutionary history. Sequence and structure analyses indicate that the BSKs are CAKs, a prevalent group of small molecule kinases in bacteria that is distantly related to the eukaryotic protein kinases. YtaA has substantial structural similarity to CAKs, but also displays distinctive features that broaden our understanding of the CAK group. Evolutionary constraint analysis of the protein surfaces indicates that members of the BSK family have distinct clade-conserved patterns in the substrate binding region, and probably bind and phosphorylate distinct targets. Several classes of BSKs have apparently independently lost catalytic activity to become pseudokinases, indicating that the family also has a major noncatalytic function.

  12. Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations

    Science.gov (United States)

    McNally, Alan; Oren, Yaara; Kelly, Darren; Sreecharan, Tristan; Vehkala, Minna; Välimäki, Niko; Prentice, Michael B.; Ashour, Amgad; Avram, Oren; Pupko, Tal; Literak, Ivan; Guenther, Sebastian; Schaufler, Katharina; Wieler, Lothar H.; Zhiyong, Zong; Sheppard, Samuel K.; Corander, Jukka

    2016-01-01

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high-resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements, could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first-ever population-level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. PMID:27618184

  13. Genome-scale identification of Legionella pneumophila effectors using a machine learning approach.

    Directory of Open Access Journals (Sweden)

    David Burstein

    2009-07-01

    A large number of highly pathogenic bacteria utilize secretion systems to translocate effector proteins into host cells. Using these effectors, the bacteria subvert host cell processes during infection. Legionella pneumophila translocates effectors via the Icm/Dot type-IV secretion system and to date, approximately 100 effectors have been identified by various experimental and computational techniques. Effector identification is a critical first step towards the understanding of the pathogenesis system in L. pneumophila as well as in other bacterial pathogens. Here, we formulate the task of effector identification as a classification problem: each L. pneumophila open reading frame (ORF) was classified as either effector or not. We computationally defined a set of features that best distinguish effectors from non-effectors. These features cover a wide range of characteristics including taxonomical dispersion, regulatory data, genomic organization, similarity to eukaryotic proteomes and more. Machine learning algorithms utilizing these features were then applied to classify all the ORFs within the L. pneumophila genome. Using this approach we were able to predict and experimentally validate 40 new effectors, reaching a success rate of above 90%. Increasing the number of validated effectors to around 140, we were able to gain novel insights into their characteristics. Effectors were found to have low G+C content, supporting the hypothesis that a large number of effectors originate via horizontal gene transfer, probably from their protozoan host. In addition, effectors were found to cluster in specific genomic regions. Finally, we were able to provide a novel description of the C-terminal translocation signal required for effector translocation by the Icm/Dot secretion system. To conclude, we have discovered 40 novel L. pneumophila effectors, predicted over a hundred additional highly probable effectors, and shown the applicability of machine
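
    One of the features mentioned above, G+C content, is easy to compute per ORF. The sketch below shows that single feature and a naive threshold flag; it is not the authors' classifier, and the function names, the margin, the genome-wide G+C value, and the toy sequences are assumptions made for illustration.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def flag_low_gc(orfs, genome_gc, margin=0.05):
    """Return sorted names of ORFs whose G+C content is below genome_gc - margin.

    genome_gc and margin are hypothetical parameters; a real pipeline would
    feed the raw feature to a trained classifier instead of thresholding it.
    """
    return sorted(
        name for name, seq in orfs.items()
        if gc_content(seq) < genome_gc - margin
    )


# Toy sequences, not real L. pneumophila ORFs.
toy_orfs = {
    "orf1": "ATGAAATTTAAATAA",
    "orf2": "ATGGCCGGCGCCTGA",
}
low_gc = flag_low_gc(toy_orfs, genome_gc=0.38)
print(low_gc)
```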

  14. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller;

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one...... data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due...

  15. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Science.gov (United States)

    2012-01-01

    Background: Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results: In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions: This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a predictive metabolic platform

  16. iAK692: A genome-scale metabolic model of Spirulina platensis C1

    Directory of Open Access Journals (Sweden)

    Klanchui Amornpan

    2012-06-01

    Background: Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results: In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions: This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a

  17. Benchmarking of methods for identification of antimicrobial resistance genes in bacterial whole genome data

    DEFF Research Database (Denmark)

    Clausen, Philip T. L. C.; Zankari, Ea; Aarestrup, Frank Møller;

    2016-01-01

    Next generation sequencing (NGS) may be an alternative to phenotypic susceptibility testing for surveillance and clinical diagnosis. However, current bioinformatics methods may be associated with false positives and negatives. In this study, a novel mapping method was developed and benchmarked...... to two different methods in current use for identification of antibiotic resistance genes in bacterial WGS data. A novel method, KmerResistance, which examines the co-occurrence of k-mers between the WGS data and a database of resistance genes, was developed. The performance of this method was compared...... with two previously described methods; ResFinder and SRST2, which use an assembly/BLAST method and BWA, respectively, using two datasets with a total of 339 isolates, covering five species, originating from the Oxford University Hospitals NHS Trust and Danish pig farms. The predicted resistance...
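
    The notion of k-mer co-occurrence between sequencing data and a database of resistance genes can be sketched as follows. This is a toy illustration of the general idea only, not the KmerResistance implementation; the function names, the choice of k, and the sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def gene_kmer_coverage(reads, genes, k=5):
    """For each gene, return the fraction of its distinct k-mers seen in the reads."""
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    coverage = {}
    for name, gene_seq in genes.items():
        gene_kmers = kmers(gene_seq, k)
        shared = gene_kmers & read_kmers
        coverage[name] = len(shared) / len(gene_kmers) if gene_kmers else 0.0
    return coverage


# Toy database and reads; real tools use much longer k-mers and real gene sequences.
toy_genes = {"geneA": "ATGGCTAGCTAGGCTA", "geneB": "TTTTCCCCGGGGAAAA"}
toy_reads = ["ATGGCTAGCT", "GCTAGCTAGGCTA"]
result = gene_kmer_coverage(toy_reads, toy_genes, k=5)
print(result)
```

    A gene whose k-mers are almost all present in the reads is a candidate hit, while one with few shared k-mers is not; production methods additionally account for sequencing depth and for k-mers shared between related genes.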

  18. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    Science.gov (United States)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.

  19. Modeling of Scale-Dependent Bacterial Growth by Chemical Kinetics Approach

    Directory of Open Access Journals (Sweden)

    Haydee Martínez

    2014-01-01

    We applied the so-called chemical kinetics approach to complex bacterial growth patterns that were dependent on the liquid-surface-area-to-volume ratio (SA/V) of the bacterial cultures. The kinetic modeling was based on current experimental knowledge in terms of autocatalytic bacterial growth, its inhibition by the metabolite CO2, and the relief of inhibition through the physical escape of the inhibitor. The model quantitatively reproduces kinetic data of SA/V-dependent bacterial growth and can discriminate between differences in the growth dynamics of enteropathogenic E. coli, E. coli JM83, and Salmonella typhimurium on one hand and Vibrio cholerae on the other hand. Furthermore, the data fitting procedures allowed predictions about the velocities of the involved key processes and the potential behavior in an open-flow bacterial chemostat, revealing an oscillatory approach to the stationary states.
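
    The three ingredients named above (autocatalytic growth, inhibition by CO2, and escape of CO2 through the liquid surface) can be combined in a minimal toy model. The rate laws, parameter values, and function name below are assumptions chosen for illustration; they are not the equations or fitted constants of the study.

```python
def simulate_growth(sa_v, t_end=10.0, dt=0.01,
                    k_growth=1.0, k_inhib=1.0, co2_yield=1.0, k_escape=0.5,
                    n0=0.01):
    """Integrate a toy growth model with forward Euler; return (biomass, co2).

    Assumed rate laws (hypothetical, for illustration only):
      dN/dt = k_growth * N / (1 + C / k_inhib)        autocatalytic, CO2-inhibited
      dC/dt = co2_yield * dN/dt - k_escape * sa_v * C  production minus escape
    where sa_v stands for the surface-area-to-volume ratio.
    """
    n, c = n0, 0.0
    for _ in range(int(round(t_end / dt))):
        growth = k_growth * n / (1.0 + c / k_inhib)
        escape = k_escape * sa_v * c
        n += dt * growth
        c += dt * (co2_yield * growth - escape)
    return n, c


# Closed vessel (no escape) versus a culture with a large free surface.
closed_n, closed_c = simulate_growth(sa_v=0.0)
open_n, open_c = simulate_growth(sa_v=2.0)
print(closed_n, open_n)
```

    In this sketch a larger SA/V lets the inhibitor escape faster, so the same inoculum reaches a higher biomass in the same time, which is the qualitative dependence the abstract describes.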

  20. Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli

    DEFF Research Database (Denmark)

    McCloskey, Douglas; Palsson, Bernhard; Feist, Adam

    2013-01-01

    The genome-scale model (GEM) of metabolism in the bacterium Escherichia coli K-12 has been in development for over a decade and is now in wide use. GEM-enabled studies of E. coli have been primarily focused on six applications: (1) metabolic engineering, (2) model-driven discovery, (3) prediction...... of cellular phenotypes, (4) analysis of biological network properties, (5) studies of evolutionary processes, and (6) models of interspecies interactions. In this review, we provide an overview of these applications along with a critical assessment of their successes and limitations, and a perspective...... on likely future developments in the field. Taken together, the studies performed over the past decade have established a genome-scale mechanistic understanding of genotype-phenotype relationships in E. coli metabolism that forms the basis for similar efforts for other microbial species. Future challenges...

  1. A systems approach to predict oncometabolites via context-specific genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Hojung Nam

    2014-09-01

    Altered metabolism in cancer cells has been viewed as a passive response required for a malignant transformation. However, this view has changed through the recently described metabolic oncogenic factors: mutated isocitrate dehydrogenases (IDH), succinate dehydrogenase (SDH), and fumarate hydratase (FH) that produce oncometabolites that competitively inhibit epigenetic regulation. In this study, we demonstrate in silico predictions of oncometabolites that have the potential to dysregulate epigenetic controls in nine types of cancer by incorporating massive scale genetic mutation information (collected from more than 1,700 cancer genomes), expression profiling data, and deploying Recon 2 to reconstruct context-specific genome-scale metabolic models. Our analysis predicted 15 compounds and 24 substructures of potential oncometabolites that could result from the loss-of-function and gain-of-function mutations of metabolic enzymes, respectively. These results suggest a substantial potential for discovering unidentified oncometabolites in various forms of cancers.

  2. Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins

    DEFF Research Database (Denmark)

    Irani, Zahra Azimzadeh; Kerkhoven, Eduard J.; Shojaosadati, Seyed Abbas;

    2016-01-01

    Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins...... produced by P. pastoris is the difference in N-glycosylation of proteins produced by humans and this yeast. However, through metabolic engineering, a P. pastoris strain capable of producing humanized N-glycosylated proteins was constructed. The current genome-scale models of P. pastoris address...... neither native nor humanized N-glycosylation, and we therefore developed ihGlycopastoris, an extension to the iLC915 model with both native and humanized N-glycosylation for recombinant protein production, but also an estimation of N-glycosylation of P. pastoris native proteins. This new model gives a better...

  3. Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments.

    Science.gov (United States)

    Zhuang, Kai; Izallalen, Mounir; Mouser, Paula; Richter, Hanno; Risso, Carla; Mahadevan, Radhakrishnan; Lovley, Derek R

    2011-02-01

    The advent of rapid complete genome sequencing, and the potential to capture this information in genome-scale metabolic models, provide the possibility of comprehensively modeling microbial community interactions. For example, Rhodoferax and Geobacter species are acetate-oxidizing Fe(III)-reducers that compete in anoxic subsurface environments and this competition may have an influence on the in situ bioremediation of uranium-contaminated groundwater. Therefore, genome-scale models of Geobacter sulfurreducens and Rhodoferax ferrireducens were used to evaluate how Geobacter and Rhodoferax species might compete under diverse conditions found in a uranium-contaminated aquifer in Rifle, CO. The model predicted that at the low rates of acetate flux expected under natural conditions at the site, Rhodoferax will outcompete Geobacter as long as sufficient ammonium is available. The model also predicted that when high concentrations of acetate are added during in situ bioremediation, Geobacter species would predominate, consistent with field-scale observations. This can be attributed to the higher expected growth yields of Rhodoferax and the ability of Geobacter to fix nitrogen. The modeling predicted relative proportions of Geobacter and Rhodoferax in geochemically distinct zones of the Rifle site that were comparable to those that were previously documented with molecular techniques. The model also predicted that under nitrogen fixation, higher carbon and electron fluxes would be diverted toward respiration rather than biomass formation in Geobacter, providing a potential explanation for enhanced in situ U(VI) reduction in low-ammonium zones. These results show that genome-scale modeling can be a useful tool for predicting microbial interactions in subsurface environments and shows promise for designing bioremediation strategies.

  4. The genome-scale metabolic extreme pathway structure in Haemophilus influenzae shows significant network redundancy.

    Science.gov (United States)

    Papin, Jason A; Price, Nathan D; Edwards, Jeremy S; Palsson, Bernhard Ø

    2002-03-07

    Genome-scale metabolic networks can be characterized by a set of systemically independent and unique extreme pathways. These extreme pathways span a convex, high-dimensional space that circumscribes all potential steady-state flux distributions achievable by the defined metabolic network. Genome-scale extreme pathways associated with the production of non-essential amino acids in Haemophilus influenzae were computed. They offer valuable insight into the functioning of its metabolic network. Three key results were obtained. First, there were multiple internal flux maps corresponding to externally indistinguishable states. It was shown that there was an average of 37 internal states per unique exchange flux vector in H. influenzae when the network was used to produce a single amino acid while allowing carbon dioxide and acetate as carbon sinks. With the inclusion of succinate as an additional output, this ratio increased to 52, a 40% increase. Second, an analysis of the carbon fates illustrated that the extreme pathways were non-uniformly distributed across the carbon fate spectrum. In the detailed case study, 45% of the distinct carbon fate values associated with lysine production represented 85% of the extreme pathways. Third, this distribution fell between distinct systemic constraints. For lysine production, the carbon fate values that represented 85% of the pathways described above corresponded to only 2 distinct ratios of 1:1 and 4:1 between carbon dioxide and acetate. The present study analysed single outputs from one organism, and provides a start to genome-scale extreme pathways studies. These emergent system-level characterizations show the significance of metabolic extreme pathway analysis at the genome-scale.
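
    For orientation, the standard steady-state flux cone behind extreme pathway analysis, and the arithmetic for the reported increase from 37 to 52 internal states, can be written out as below. The symbols S (stoichiometric matrix), v (flux vector), p_i (extreme pathways) and \alpha_i (non-negative weights) are notation assumed here, not taken from the abstract.

```latex
% Steady-state flux cone of a network with stoichiometric matrix S
% (reversible reactions assumed split into irreversible pairs)
\[
  S\,v = 0, \qquad v \ge 0, \qquad
  v = \sum_{i} \alpha_i\, p_i, \quad \alpha_i \ge 0,
\]
% where the p_i are the extreme pathways, i.e. the edges of the cone.

% Relative change in internal states per unique exchange flux vector
% when succinate is allowed as an additional output
\[
  \frac{52 - 37}{37} \approx 0.405 \approx 40\%
\]
```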

  5. Genome-scale modeling of human metabolism - a systems biology approach.

    Science.gov (United States)

    Mardinoglu, Adil; Gatto, Francesco; Nielsen, Jens

    2013-09-01

    Altered metabolism is linked to the appearance of various human diseases and a better understanding of disease-associated metabolic changes may lead to the identification of novel prognostic biomarkers and the development of new therapies. Genome-scale metabolic models (GEMs) have been employed for studying human metabolism in a systematic manner, as well as for understanding complex human diseases. In the past decade, such metabolic models - one of the fundamental aspects of systems biology - have started contributing to the understanding of the mechanistic relationship between genotype and phenotype. In this review, we focus on the construction of the Human Metabolic Reaction database, the generation of healthy cell type- and cancer-specific GEMs using different procedures, and the potential applications of these developments in the study of human metabolism and in the identification of metabolic changes associated with various disorders. We further examine how in silico genome-scale reconstructions can be employed to simulate metabolic flux distributions and how high-throughput omics data can be analyzed in a context-dependent fashion. Insights yielded from this mechanistic modeling approach can be used for identifying new therapeutic agents and drug targets as well as for the discovery of novel biomarkers. Finally, recent advancements in genome-scale modeling and the future challenge of developing a model of whole-body metabolism are presented. The emergent contribution of GEMs to personalized and translational medicine is also discussed.

  6. Diagnostics for stochastic genome-scale modeling via model slicing and debugging.

    Directory of Open Access Journals (Sweden)

    Kevin J Tsai

    Modeling of biological behavior has evolved from simple gene expression plots represented by mathematical equations to genome-scale systems biology networks. However, due to obstacles in complexity and scalability of creating genome-scale models, several biological modelers have turned to programming or scripting languages and away from modeling fundamentals. In doing so, they have traded the ability to have exchangeable, standardized model representation formats, while those that remain true to standardized model representation are faced with challenges in model complexity and analysis. We have developed a model diagnostic methodology inspired by program slicing and debugging and demonstrate the effectiveness of the methodology on a genome-scale metabolic network model published in the BioModels database. The computer-aided identification revealed specific points of interest such as reversibility of reactions, initialization of species amounts, and parameter estimation that improved a candidate cell's adenosine triphosphate production. We then compared the advantages of our methodology over other modeling techniques such as model checking and model reduction. A software application that implements the methodology is available at http://gel.ym.edu.tw/gcs/.

  7. Diagnostics for stochastic genome-scale modeling via model slicing and debugging.

    Science.gov (United States)

    Tsai, Kevin J; Chang, Chuan-Hsiung

    2014-01-01

    Modeling of biological behavior has evolved from simple gene expression plots represented by mathematical equations to genome-scale systems biology networks. However, due to obstacles in complexity and scalability of creating genome-scale models, several biological modelers have turned to programming or scripting languages and away from modeling fundamentals. In doing so, they have traded the ability to have exchangeable, standardized model representation formats, while those that remain true to standardized model representation are faced with challenges in model complexity and analysis. We have developed a model diagnostic methodology inspired by program slicing and debugging and demonstrate the effectiveness of the methodology on a genome-scale metabolic network model published in the BioModels database. The computer-aided identification revealed specific points of interest such as reversibility of reactions, initialization of species amounts, and parameter estimation that improved a candidate cell's adenosine triphosphate production. We then compared the advantages of our methodology over other modeling techniques such as model checking and model reduction. A software application that implements the methodology is available at http://gel.ym.edu.tw/gcs/.

  8. Comparative genome-scale reconstruction of gapless metabolic networks for present and ancestral species.

    Directory of Open Access Journals (Sweden)

    Esa Pitkänen

    2014-02-01

    Full Text Available We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging on the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionary distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/.

  9. Construction of an E. coli genome-scale atom mapping model for MFA calculations.

    Science.gov (United States)

    Ravikirthi, Prabhasa; Suthers, Patrick F; Maranas, Costas D

    2011-06-01

    Metabolic flux analysis (MFA) has so far been restricted to lumped networks lacking many important pathways, partly due to the difficulty in automatically generating isotope mapping matrices for genome-scale metabolic networks. Here we introduce a procedure that uses a compound matching algorithm based on the graph theoretical concept of pattern recognition along with relevant reaction information to automatically generate genome-scale atom mappings which trace the path of atoms from reactants to products for every reaction. The procedure is applied to the iAF1260 metabolic reconstruction of Escherichia coli yielding the genome-scale isotope mapping model imPR90068. This model maps 90,068 non-hydrogen atoms that span all 2,077 reactions present in iAF1260 (previous largest mapping model included 238 reactions). The expanded scope of the isotope mapping model allows the complete tracking of labeled atoms through pathways such as cofactor and prosthetic group biosynthesis and histidine metabolism. An EMU representation of imPR90068 is also constructed and made available.

  10. Modeling Method for Increased Precision and Scope of Directly Measurable Fluxes at a Genome-Scale.

    Science.gov (United States)

    McCloskey, Douglas; Young, Jamey D; Xu, Sibei; Palsson, Bernhard O; Feist, Adam M

    2016-04-05

    Metabolic flux analysis (MFA) is considered to be the gold standard for determining the intracellular flux distribution of biological systems. The majority of work using MFA has been limited to core models of metabolism due to challenges in implementing genome-scale MFA and the undesirable trade-off between increased scope and decreased precision in flux estimations. This work presents a tunable workflow for expanding the scope of MFA to the genome-scale without trade-offs in flux precision. The genome-scale MFA model presented here, iDM2014, accounts for 537 net reactions, which includes the core pathways of traditional MFA models and also covers the additional pathways of purine, pyrimidine, isoprenoid, methionine, riboflavin, coenzyme A, and folate, as well as other biosynthetic pathways. When evaluating iDM2014 using a set of measured intracellular intermediate and cofactor mass isotopomer distributions (MIDs), it was found that a total of 232 net fluxes of central and peripheral metabolism could be resolved in the E. coli network. The increase in scope was shown to cover the full biosynthetic route to an expanded set of bioproduction pathways, which should facilitate applications such as the design of more complex bioprocessing strains and aid in identifying new antimicrobials. Importantly, it was found that there was no loss in precision of core fluxes when compared to a traditional core model, and additionally there was an overall increase in precision when considering all observable reactions.

  11. Pantograph: A template-based method for genome-scale metabolic model reconstruction.

    Science.gov (United States)

    Loira, Nicolas; Zhukova, Anna; Sherman, David James

    2015-04-01

    Genome-scale metabolic models are a powerful tool to study the inner workings of biological systems and to guide applications. The advent of cheap sequencing has brought the opportunity to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process where extensive manual curation is required. This curation introduces specific knowledge about the modeled organism, either explicitly in the form of molecular processes, or indirectly in the form of annotations of the model elements. Paradoxically, this knowledge is usually lost when reconstruction of a different organism is started. We introduce the Pantograph method for metabolic model reconstruction. This method combines a template reaction knowledge base, orthology mappings between two organisms, and experimental phenotypic evidence, to build a genome-scale metabolic model for a target organism. Our method infers implicit knowledge from annotations in the template, and rewrites these inferences to include them in the resulting model of the target organism. The generated model is well suited for manual curation. Scripts for evaluating the model with respect to experimental data are automatically generated, to aid curators in iterative improvement. We present an implementation of the Pantograph method, as a toolbox for genome-scale model reconstruction, curation and validation. This open source package can be obtained from: http://pathtastic.gforge.inria.fr.

  12. Comparative Genome-Scale Reconstruction of Gapless Metabolic Networks for Present and Ancestral Species

    Science.gov (United States)

    Pitkänen, Esa; Jouhten, Paula; Hou, Jian; Syed, Muhammad Fahad; Blomberg, Peter; Kludas, Jana; Oja, Merja; Holm, Liisa; Penttilä, Merja; Rousu, Juho; Arvas, Mikko

    2014-01-01

    We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging on the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionary distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/. PMID:24516375

  13. Using weakly conserved motifs hidden in secretion signals to identify type-III effectors from bacterial pathogen genomes.

    Directory of Open Access Journals (Sweden)

    Xiaobao Dong

    Full Text Available BACKGROUND: As one of the most important virulence factor types in gram-negative pathogenic bacteria, type-III effectors (TTEs) play a crucial role in pathogen-host interactions by directly influencing immune signaling pathways within host cells. Based on the hypothesis that type-III secretion signals may be comprised of some weakly conserved sequence motifs, here we used profile-based amino acid pair information to develop an accurate TTE predictor. RESULTS: For a TTE or non-TTE, we first used a hidden Markov model-based sequence searching method (i.e., HHblits) to detect its weakly homologous sequences and extracted the profile-based k-spaced amino acid pair composition (HH-CKSAAP) from the N-terminal sequences. In the next step, the feature vector HH-CKSAAP was used to train a linear support vector machine model, which we designate as BEAN (Bacterial Effector ANalyzer). We compared our method with four existing TTE predictors through an independent test set, and our method revealed improved performance. Furthermore, we listed the most predictive amino acid pairs according to their weights in the established classification model. Evolutionary analysis shows that predictive amino acid pairs tend to be more conserved. Some predictive amino acid pairs also show significantly different position distributions between TTEs and non-TTEs. These analyses confirmed that some weakly conserved sequence motifs may play important roles in type-III secretion signals. Finally, we also used BEAN to scan one plant pathogen genome and showed that BEAN can be used for genome-wide TTE identification. The webserver and stand-alone version of BEAN are available at http://protein.cau.edu.cn:8080/bean/.
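
As a rough illustration of the feature type named in the abstract above, the following sketch computes a plain k-spaced amino acid pair composition from the N-terminal part of a sequence. It is not the BEAN implementation: the profile-based variant (HH-CKSAAP) works on HHblits profiles rather than raw residues, and the function name `cksaap`, the cutoff `n_terminal=50`, and the default `k_max=2` are assumptions chosen only for this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(sequence, k_max=2, n_terminal=50):
    """Return k-spaced amino acid pair frequencies for the N-terminal region.

    For each gap size k in 0..k_max, count residue pairs separated by k
    positions and normalise by the number of pairs seen for that k.
    The n_terminal cutoff is a hypothetical choice for illustration.
    """
    region = sequence[:n_terminal].upper()
    features = {}
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        total = 0
        for i in range(len(region) - k - 1):
            pair = region[i] + region[i + k + 1]
            if pair in counts:  # skip non-standard residues such as X
                counts[pair] += 1
                total += 1
        for pair, count in counts.items():
            features[(k, pair)] = count / total if total else 0.0
    return features
```

Each sequence yields 400 frequencies per gap size, so the resulting vector could in principle be fed to any linear classifier; the classifier choice and training data are outside the scope of this sketch.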

  14. Research progress of bacterial pan-genome

    Institute of Scientific and Technical Information of China (English)

    庄绪冉; 朱泳璋

    2012-01-01

    Bacteria are among the most ancient groups of organisms in nature. Different species, and even different strains of the same species, show rich genetic diversity and clear phenotypic differentiation. The genetic basis of this differentiation lies mainly in differences in the genomic information carried by individual strains. The concept of the bacterial pan-genome was put forward to describe, at the genome level, the genetic diversity among individuals within a species and to explore the phylogenetic relationships among individuals and the genetic basis of their phenotypic differences. This paper reviews the mechanisms that generate genetic diversity in bacteria, the research strategies used in pan-genome studies, and the application and progress of the pan-genome in bacterial research.

  15. "Black Holes" and Bacterial Pathogenicity: A Large Genomic Deletion that Enhances the Virulence of Shigella spp. and Enteroinvasive Escherichia coli

    Science.gov (United States)

    Maurelli, Anthony T.; Fernandez, Reinaldo E.; Bloch, Craig A.; Rode, Christopher K.; Fasano, Alessio

    1998-03-01

    Plasmids, bacteriophages, and pathogenicity islands are genomic additions that contribute to the evolution of bacterial pathogens. For example, Shigella spp., the causative agents of bacillary dysentery, differ from the closely related commensal Escherichia coli in the presence of a plasmid in Shigella that encodes virulence functions. However, pathogenic bacteria also may lack properties that are characteristic of nonpathogens. Lysine decarboxylase (LDC) activity is present in ≈ 90% of E. coli strains but is uniformly absent in Shigella strains. When the gene for LDC, cadA, was introduced into Shigella flexneri 2a, virulence became attenuated, and enterotoxin activity was inhibited greatly. The enterotoxin inhibitor was identified as cadaverine, a product of the reaction catalyzed by LDC. Comparison of the S. flexneri 2a and laboratory E. coli K-12 genomes in the region of cadA revealed a large deletion in Shigella. Representative strains of Shigella spp. and enteroinvasive E. coli displayed similar deletions of cadA. Our results suggest that, as Shigella spp. evolved from E. coli to become pathogens, they not only acquired virulence genes on a plasmid but also shed genes via deletions. The formation of these "black holes," deletions of genes that are detrimental to a pathogenic lifestyle, provides an evolutionary pathway that enables a pathogen to enhance virulence. Furthermore, the demonstration that cadaverine can inhibit enterotoxin activity may lead to more general models about toxin activity or entry into cells and suggests an avenue for antitoxin therapy. Thus, understanding the role of black holes in pathogen evolution may yield clues to new treatments of infectious diseases.

  16. Factors Controlling Soil Microbial Biomass and Bacterial Diversity and Community Composition in a Cold Desert Ecosystem: Role of Geographic Scale

    OpenAIRE

    Horn, David J. van; Lee Van Horn, M.; Barrett, John E.; Gooseff, Michael N.; Altrichter, Adam E; Geyer, Kevin M; Lydia H Zeglin; Takacs-Vesbach, Cristina D.

    2013-01-01

    Understanding controls over the distribution of soil bacteria is a fundamental step toward describing soil ecosystems, understanding their functional capabilities, and predicting their responses to environmental change. This study investigated the controls on the biomass, species richness, and community structure and composition of soil bacterial communities in the McMurdo Dry Valleys, Antarctica, at local and regional scales. The goals of the study were to describe the relationships between ...

  17. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing

    Science.gov (United States)

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-01-01

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors. PMID:25331151

  18. Burkholderia contaminans Biofilm Regulating Operon and Its Distribution in Bacterial Genomes.

    Science.gov (United States)

    Voronina, Olga L; Kunda, Marina S; Ryzhova, Natalia N; Aksenova, Ekaterina I; Semenov, Andrey N; Romanova, Yulia M; Gintsburg, Alexandr L

    2016-01-01

    Biofilm formation by Burkholderia spp. is a principal cause of chronic lung infections in cystic fibrosis patients. A "lacking biofilm production" (LBP) strain, B. contaminans GIMC4587:Bct370-19, has been obtained by insertion modification of a clinical strain with plasposon mutagenesis. It has an interrupted transcriptional response regulator (RR) gene. The focus of our investigation was the determination of a two-component signal transduction system that includes this RR. B. contaminans clinical and LBP strains were analyzed by whole genome sequencing and bioinformatics resources. A four-component operon (BiofilmReg) has a key role in biofilm formation. The relative location of the RR and histidine kinase genes (i.e., their separation by another gene) is unique in BiofilmReg. Orthologs were found in other members of the Burkholderiales order. Phylogenetic analysis of strains containing BiofilmReg operons demonstrated evidence for earlier inheritance of a three-component operon. During further evolution one lineage acquired a fourth gene, whereas others lost the third component of the operon. Mutations in sensor domains have created biodiversity which is advantageous for adaptation to various ecological niches. Strains of different Burkholderia and Achromobacter species all demonstrated a similar BiofilmReg operon structure. Therefore, there may be an opportunity to develop a common drug which is effective for treating all these causative agents.

  19. Similar processes but different environmental filters for soil bacterial and fungal community composition turnover on a broad spatial scale.

    Directory of Open Access Journals (Sweden)

    Nicolas Chemidlin Prévost-Bouré

    Full Text Available Spatial scaling of microorganisms has been demonstrated over the last decade. However, the processes and environmental filters shaping soil microbial community structure on a broad spatial scale still need to be refined and ranked. Here, we compared bacterial and fungal community composition turnovers through a biogeographical approach on the same soil sampling design at a broad spatial scale (area range: 13,300 to 31,000 km²): (i) to examine their spatial structuring; (ii) to investigate the relative importance of environmental selection and spatial autocorrelation in determining their community composition turnover; and (iii) to identify and rank the relevant environmental filters and scales involved in their spatial variations. Molecular fingerprinting of soil bacterial and fungal communities was performed on 413 soils from four French regions of contrasting environmental heterogeneity (Landes […]). Bacterial and fungal community composition turnovers were mainly driven by environmental selection, explaining from 10% to 20% of community composition variations, but spatial variables also explained 3% to 9% of total variance. These variables highlighted significant spatial autocorrelation of both communities unexplained by the environmental variables measured and could partly be explained by dispersal limitations. Although the identified filters and their hierarchy were dependent on the region and organism, selection was systematically based on a common group of environmental variables: pH, trophic resources, texture and land use. Spatial autocorrelation was also important at

  20. Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model

    NARCIS (Netherlands)

    Teusink, B.; Wiersma, A.; Molenaar, D.; Francke, C.; Vos, de W.M.; Siezen, R.J.; Smid, E.J.

    2006-01-01

    A genome-scale metabolic model of the lactic acid bacterium Lactobacillus plantarum WCFS1 was constructed based on genomic content and experimental data. The complete model includes 721 genes, 643 reactions, and 531 metabolites. Different stoichiometric modeling techniques were used for interpretati

  1. Analysis of Comparative Sequence and Genomic Data to Verify Phylogenetic Relationship and Explore a New Subfamily of Bacterial Lipases.

    Directory of Open Access Journals (Sweden)

    Malihe Masomian

    Full Text Available Thermostable and organic solvent-tolerant enzymes have significant potential in a wide range of synthetic reactions in industry due to their inherent stability at high temperatures and their ability to endure harsh organic solvents. In this study, a novel gene encoding a true lipase was isolated by construction of a genomic DNA library of thermophilic Aneurinibacillus thermoaerophilus strain HZ in an Escherichia coli plasmid vector. Sequence analysis revealed that HZ lipase had 62% identity to a putative lipase from Bacillus pseudomycoides. The closest characterized lipases to HZ lipase are thermostable Bacillus and Geobacillus lipases belonging to subfamily I.5, with ≤ 57% identity. Amino acid sequence analysis of HZ lipase identified a conserved pentapeptide containing the active serine, GHSMG, and a Ca2+-binding motif, GCYGSD, in the enzyme. Protein structure modeling showed that HZ lipase consisted of an α/β hydrolase fold and a lid domain. Protein sequence alignment, conserved regions analysis, clustal distance matrix and amino acid composition illustrated differences between HZ lipase and other thermostable lipases. Phylogenetic analysis revealed that this lipase represented a new subfamily of family I of bacterial true lipases, classified as family I.9. The HZ lipase was expressed under promoter Plac using IPTG and was characterized. The recombinant enzyme showed optimal activity at 65 °C and retained ≥ 97% activity after incubation at 50 °C for 1 h. The HZ lipase was stable in various polar and non-polar organic solvents.

  2. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Héloïse Bastide

    2013-06-01

    Full Text Available Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome.
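
One common way to contrast allele counts between paired pools across replicates is a Cochran-Mantel-Haenszel (CMH) test on stratified 2x2 tables. The abstract above does not state which statistic the authors used, so the CMH choice here is an assumption, and the function name and input layout are hypothetical. The sketch returns the chi-square statistic (1 degree of freedom, no continuity correction) for a single SNP.

```python
def cmh_statistic(replicates):
    """Cochran-Mantel-Haenszel chi-square for one SNP across replicates.

    replicates: list of (light_ref, light_alt, dark_ref, dark_alt) read
    counts, one tuple per replicate pair of light and dark pools.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in replicates:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than two reads carries no information
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if variance == 0:
        return 0.0
    return (observed - expected) ** 2 / variance
```

A large statistic indicates a consistent allele-frequency difference between light and dark pools across replicates; converting it to a p-value needs a chi-square survival function, which is left out to keep the sketch dependency-free.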

  3. Mutational analysis of the human mitochondrial genome branches into the realm of bacterial genetics

    Energy Technology Data Exchange (ETDEWEB)

    Howell, N. [Univ. of Texas Medical Branch, Galveston, TX (United States)

    1996-10-01

    This is shaping up as a vintage year for studies of the genetics and evolution of the human mitochondrial genome (mtDNA). In a theoretical and experimental tour de force, Shenkar et al. (1996), on pages 772-780 of this issue, derive the mutation rate of the 4,977-bp (or "common") deletion in the human mtDNA through refinement and extension of fluctuation analysis, a technique that was first used >50 years ago. Shenkar et al., in essence, have solved or bypassed many of the difficulties that are inherent in the application of fluctuation analysis to human mitochondrial gene mutations. Their study is important for two principal reasons. In the first place, high levels of this deletion cause a variety of pathological disorders, including Kearns-Sayre syndrome and chronic progressive external ophthalmoplegia. Their current report, therefore, is a major step in the elucidation of the molecular genetic pathogenesis of this group of mitochondrial disorders. For example, it now may be feasible to analyze the effects of selection on transmission and segregation of this deletion and, perhaps, other mtDNA mutations as well. Second, and at a broader level, the approach of Shenkar et al. should find widespread applicability to the study of other mtDNA mutations. It has been recognized for several years that mammalian mtDNA mutates much more rapidly than nuclear DNA, a phenomenon with potentially profound evolutionary implications. It is exciting and useful, both experimentally and theoretically, that this "old" approach can be used for "new" applications. 56 refs.

  4. Bacterial Factors Associated with Lethal Outcome of Enteropathogenic Escherichia coli Infection: Genomic Case-Control Studies.

    Directory of Open Access Journals (Sweden)

    Michael S Donnenberg

    2015-05-01

    Full Text Available Typical enteropathogenic Escherichia coli (tEPEC) strains were associated with mortality in the Global Enteric Multicenter Study (GEMS). Genetic differences in tEPEC strains could underlie some of the variability in clinical outcome. We produced draft genome sequences of all available tEPEC strains from GEMS lethal infections (LIs) and of closely matched EPEC strains from GEMS subjects with non-lethal symptomatic infections (NSIs) and asymptomatic infections (AIs) to identify gene clusters (potential protein-encoding sequences sharing ≥90% nucleotide sequence identity) associated with lethality. Among 14,412 gene clusters identified, the presence or absence of 392 was associated with clinical outcome. As expected, more gene clusters were associated with LI versus AI than LI versus NSI. The gene clusters more prevalent in strains from LI than those from NSI and AI included those encoding proteins involved in O-antigen biogenesis, while clusters encoding type 3 secretion effectors EspJ and OspB were among those more prevalent in strains from non-lethal infections. One gene cluster encoding a variant of an NleG ubiquitin ligase was associated with LI versus AI, while two other nleG clusters had the opposite association. Similar associations were found for two nleG gene clusters in an additional, larger sample of NSI and AI GEMS strains. Particular genes are associated with lethal tEPEC infections. Further study of these factors holds potential to unravel the mechanisms underlying severe disease and to prevent adverse outcomes.
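
A presence/absence association of the kind described above can be tested per gene cluster with a 2x2 contingency table. The abstract does not name the test the authors applied, so the two-sided Fisher's exact test below is an assumed stand-in, and the function name and argument layout are hypothetical. The sketch uses only the standard library (`math.comb`, Python 3.8 or later).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Assumed layout: a = lethal strains with the cluster, b = lethal strains
    without it, c = non-lethal strains with it, d = non-lethal without it.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

With thousands of clusters tested, the resulting p-values would need a multiple-testing correction before any cluster is called associated; that step is not shown here.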

  5. Genome-scale reconstruction of metabolic networks of Lactobacillus casei ATCC 334 and 12A.

    Directory of Open Access Journals (Sweden)

    Elena Vinay-Lara

    Full Text Available Lactobacillus casei strains are widely used in industry and the utility of this organism in these industrial applications is strain dependent. Hence, tools capable of predicting strain specific phenotypes would have utility in the selection of strains for specific industrial processes. Genome-scale metabolic models can be utilized to better understand genotype-phenotype relationships and to compare different organisms. To assist in the selection and development of strains with enhanced industrial utility, genome-scale models for L. casei ATCC 334, a well characterized strain, and strain 12A, a corn silage isolate, were constructed. Draft models were generated from RAST genome annotations using the Model SEED database and refined by evaluating ATP generating cycles, mass-and-charge-balances of reactions, and growth phenotypes. After the validation process was finished, we compared the metabolic networks of these two strains to identify metabolic, genetic and ortholog differences that may lead to different phenotypic behaviors. We conclude that the metabolic capabilities of the two networks are highly similar. The L. casei ATCC 334 model accounts for 1,040 reactions, 959 metabolites and 548 genes, while the L. casei 12A model accounts for 1,076 reactions, 979 metabolites and 640 genes. The developed L. casei ATCC 334 and 12A metabolic models will enable better understanding of the physiology of these organisms and be valuable tools in the development and selection of strains with enhanced utility in a variety of industrial applications.
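
The refinement step mentioned above, checking mass and charge balance of reactions, can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: the function names are hypothetical, the formula parser handles only flat formulas without parentheses, and the hexokinase example with its metabolite identifiers is included purely for demonstration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a flat formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def reaction_imbalance(stoichiometry, metabolites):
    """Return (element imbalance, net charge) for one reaction.

    stoichiometry: {metabolite_id: coefficient}, negative for substrates.
    metabolites: {metabolite_id: (formula, charge)}.
    A balanced reaction gives ({}, 0).
    """
    elements = Counter()
    charge = 0
    for met, coeff in stoichiometry.items():
        formula, met_charge = metabolites[met]
        for element, n in parse_formula(formula).items():
            elements[element] += coeff * n
        charge += coeff * met_charge
    return {e: n for e, n in elements.items() if n != 0}, charge


# Illustrative data: hexokinase, glucose + ATP -> glucose 6-phosphate + ADP + H+
METABOLITES = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}
HEX1 = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
```

Calling `reaction_imbalance(HEX1, METABOLITES)` on the balanced reaction returns an empty imbalance and zero net charge; dropping the proton from the products exposes a missing hydrogen and a charge mismatch, which is the kind of error such a check is meant to flag in a draft reconstruction.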

  6. Genome-Scale Reconstruction of Metabolic Networks of Lactobacillus casei ATCC 334 and 12A

    Science.gov (United States)

    Vinay-Lara, Elena; Hamilton, Joshua J.; Stahl, Buffy; Broadbent, Jeff R.; Reed, Jennifer L.; Steele, James L.

    2014-01-01

    Lactobacillus casei strains are widely used in industry and the utility of this organism in these industrial applications is strain dependent. Hence, tools capable of predicting strain specific phenotypes would have utility in the selection of strains for specific industrial processes. Genome-scale metabolic models can be utilized to better understand genotype-phenotype relationships and to compare different organisms. To assist in the selection and development of strains with enhanced industrial utility, genome-scale models for L. casei ATCC 334, a well characterized strain, and strain 12A, a corn silage isolate, were constructed. Draft models were generated from RAST genome annotations using the Model SEED database and refined by evaluating ATP generating cycles, mass-and-charge-balances of reactions, and growth phenotypes. After the validation process was finished, we compared the metabolic networks of these two strains to identify metabolic, genetic and ortholog differences that may lead to different phenotypic behaviors. We conclude that the metabolic capabilities of the two networks are highly similar. The L. casei ATCC 334 model accounts for 1,040 reactions, 959 metabolites and 548 genes, while the L. casei 12A model accounts for 1,076 reactions, 979 metabolites and 640 genes. The developed L. casei ATCC 334 and 12A metabolic models will enable better understanding of the physiology of these organisms and be valuable tools in the development and selection of strains with enhanced utility in a variety of industrial applications. PMID:25365062

  7. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Science.gov (United States)

    Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L; Peers, Graham; Beeri, Karen; Mayers, Joshua; Gallina, Alessandra A; Allen, Andrew E; Palsson, Bernhard O; Zengler, Karsten

    2016-01-01

    Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.

  8. Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom.

    Directory of Open Access Journals (Sweden)

    Jennifer Levering

    Full Text Available Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. The model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.

  9. Functional states of the genome-scale Escherichia coli transcriptional regulatory system.

    Directory of Open Access Journals (Sweden)

    Erwin P Gianchandani

    2009-06-01

    Full Text Available A transcriptional regulatory network (TRN) constitutes the collection of regulatory rules that link environmental cues to the transcription state of a cell's genome. We recently proposed a matrix formalism that quantitatively represents a system of such rules (a transcriptional regulatory system [TRS]) and allows systemic characterization of TRS properties. The matrix formalism not only allows the computation of the transcription state of the genome but also the fundamental characterization of the input-output mapping that it represents. Furthermore, a key advantage of this "pseudo-stoichiometric" matrix formalism is its ability to easily integrate with existing stoichiometric matrix representations of signaling and metabolic networks. Here we demonstrate for the first time how this matrix formalism is extendable to large-scale systems by applying it to the genome-scale Escherichia coli TRS. We analyze the fundamental subspaces of the regulatory network matrix (R) to describe intrinsic properties of the TRS. We further use Monte Carlo sampling to evaluate the E. coli transcription state across a subset of all possible environments, comparing our results to published gene expression data as validation. Finally, we present novel in silico findings for the E. coli TRS, including (1) a gene expression correlation matrix delineating functional motifs; (2) sets of gene ontologies for which regulatory rules governing gene transcription are poorly understood and which may direct further experimental characterization; and (3) the appearance of a distributed TRN structure, which is in stark contrast to the more hierarchical organization of metabolic networks.

  10. Functional states of the genome-scale Escherichia coli transcriptional regulatory system.

    Science.gov (United States)

    Gianchandani, Erwin P; Joyce, Andrew R; Palsson, Bernhard Ø; Papin, Jason A

    2009-06-01

    A transcriptional regulatory network (TRN) constitutes the collection of regulatory rules that link environmental cues to the transcription state of a cell's genome. We recently proposed a matrix formalism that quantitatively represents a system of such rules (a transcriptional regulatory system [TRS]) and allows systemic characterization of TRS properties. The matrix formalism not only allows the computation of the transcription state of the genome but also the fundamental characterization of the input-output mapping that it represents. Furthermore, a key advantage of this "pseudo-stoichiometric" matrix formalism is its ability to easily integrate with existing stoichiometric matrix representations of signaling and metabolic networks. Here we demonstrate for the first time how this matrix formalism is extendable to large-scale systems by applying it to the genome-scale Escherichia coli TRS. We analyze the fundamental subspaces of the regulatory network matrix (R) to describe intrinsic properties of the TRS. We further use Monte Carlo sampling to evaluate the E. coli transcription state across a subset of all possible environments, comparing our results to published gene expression data as validation. Finally, we present novel in silico findings for the E. coli TRS, including (1) a gene expression correlation matrix delineating functional motifs; (2) sets of gene ontologies for which regulatory rules governing gene transcription are poorly understood and which may direct further experimental characterization; and (3) the appearance of a distributed TRN structure, which is in stark contrast to the more hierarchical organization of metabolic networks.

  11. Characterizing the optimal flux space of genome-scale metabolic reconstructions through modified latin-hypercube sampling

    NARCIS (Netherlands)

    Chaudhary, N.; Tøndel, K.; Bhatnagar, R.; Martins dos Santos, V.A.P.; Puchalka, J.

    2016-01-01

    Genome-Scale Metabolic Reconstructions (GSMRs), along with optimization-based methods, predominantly Flux Balance Analysis (FBA) and its derivatives, are widely applied for assessing and predicting the behavior of metabolic networks upon perturbation, thereby enabling identification of potential nov

  12. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets

    NARCIS (Netherlands)

    Levering, J.; Fiedler, T.; Sieg, A.; van Grinsven, K.W.A.; Hering, S.; Veith, N.; Olivier, B.G.; Klett, L.; Hugenholtz, J.; Teusink, B.; Kreikemeyer, B.; Kummer, U.

    2016-01-01

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49

  13. Identifying anti-growth factors for human cancer cell lines through genome-scale metabolic modeling

    DEFF Research Database (Denmark)

    Ghaffari, Pouyan; Mardinoglu, Adil; Asplund, Anna

    2015-01-01

    Human cancer cell lines are used as important model systems to study molecular mechanisms associated with tumor growth, hereunder how genomic and biological heterogeneity found in primary tumors affect cellular phenotypes. We reconstructed Genome scale metabolic models (GEMs) for eleven cell lines...... based on RNA-Seq data and validated the functionality of these models with data from metabolite profiling. We used cell line-specific GEMs to analyze the differences in the metabolism of cancer cell lines, and to explore the heterogeneous expression of the metabolic subsystems. Furthermore, we predicted...... antimetabolites using two cell lines with different phenotypic origins, and found that it is effective in inhibiting the growth of these cell lines. Using immunohistochemistry, we also showed high or moderate expression levels of proteins targeted by the validated antimetabolite. Identified anti-growth factors...

  14. A Consensus Genome-scale Reconstruction of Chinese Hamster Ovary Cell Metabolism

    DEFF Research Database (Denmark)

    Hefzi, Hooman; Ang, Kok Siong; Hanscho, Michael

    2016-01-01

    Chinese hamster ovary (CHO) cells dominate biotherapeutic protein production and are widely used in mammalian cell line engineering research. To elucidate metabolic bottlenecks in protein production and to guide cell engineering and bioprocess optimization, we reconstructed the metabolic pathways...... in CHO and associated them with >1,700 genes in the Cricetulus griseus genome. The genome-scale metabolic model based on this reconstruction, iCHO1766, and cell-line-specific models for CHO-K1, CHO-S, and CHO-DG44 cells provide the biochemical basis of growth and recombinant protein production...... simulations show that the metabolic resources in CHO are more than three times more efficiently utilized for growth or recombinant protein synthesis following targeted efforts to engineer the CHO secretory pathway. This model will further accelerate CHO cell engineering and help optimize bioprocesses....

  15. Improved Evidence-Based Genome-scale Metabolic Models for Maize Leaf, Embryo, and Endosperm.

    Directory of Open Access Journals (Sweden)

    Samuel eSeaver

    2015-03-01

    Full Text Available There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.

  16. Fusion of large-scale genomic knowledge and frequency data computationally prioritizes variants in epilepsy.

    Science.gov (United States)

    Campbell, Ian M; Rao, Mitchell; Arredondo, Sean D; Lalani, Seema R; Xia, Zhilian; Kang, Sung-Hae L; Bi, Weimin; Breman, Amy M; Smith, Janice L; Bacino, Carlos A; Beaudet, Arthur L; Patel, Ankita; Cheung, Sau Wai; Lupski, James R; Stankiewicz, Paweł; Ramocki, Melissa B; Shaw, Chad A

    2013-01-01

    Curation and interpretation of copy number variants identified by genome-wide testing is challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility to resolve pathogenic from benign. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.

  17. Fusion of large-scale genomic knowledge and frequency data computationally prioritizes variants in epilepsy.

    Directory of Open Access Journals (Sweden)

    Ian M Campbell

    Full Text Available Curation and interpretation of copy number variants identified by genome-wide testing is challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility to resolve pathogenic from benign. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.

  18. Factors affecting reproducibility between genome-scale siRNA-based screens

    Science.gov (United States)

    Barrows, Nicholas J.; Le Sommer, Caroline; Garcia-Blanco, Mariano A.; Pearson, James L.

    2011-01-01

    RNA interference-based screening is a powerful new genomic technology which addresses gene function en masse. To evaluate factors influencing hit list composition and reproducibility, we performed two identically designed small interfering RNA (siRNA)-based, whole genome screens for host factors supporting yellow fever virus infection. These screens represent two separate experiments completed five months apart and allow the direct assessment of the reproducibility of a given siRNA technology when performed in the same environment. Candidate hit lists generated by sum rank, median absolute deviation, z-score, and strictly standardized mean difference were compared within and between whole genome screens. Application of these analysis methodologies within a single screening dataset using a fixed threshold equivalent to a p-value ≤ 0.001 resulted in hit lists ranging from 82 to 1,140 members and highlighted the tremendous impact analysis methodology has on hit list composition. Intra- and inter-screen reproducibility was significantly influenced by the analysis methodology and ranged from 32% to 99%. This study also highlighted the power of testing at least two independent siRNAs for each gene product in primary screens. To facilitate validation we conclude by suggesting methods to reduce false discovery at the primary screening stage. In this study we present the first comprehensive comparison of multiple analysis strategies, and demonstrate the impact of the analysis methodology on the composition of the “hit list”. Therefore, we propose that the entire dataset derived from functional genome-scale screens, especially if publicly funded, should be made available as is done with data derived from gene expression and genome-wide association studies. PMID:20625183
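
The abstract above notes that the choice of analysis method strongly affects which genes end up on a hit list. A minimal sketch of that effect, assuming a toy plate of ten hypothetical readouts (percent infection relative to control) and a hypothetical cutoff of -2, compares a mean/standard-deviation z-score with a median/MAD-based robust z-score. The gene names, values, and cutoff are illustrative assumptions only and are not taken from the study; the study's sum rank and SSMD methods are not reproduced here.

```python
import statistics

def z_scores(values):
    """Classical z-score: centre on the mean, scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def robust_z_scores(values):
    """Robust z-score: centre on the median, scale by the median absolute deviation (MAD).

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]

def select_hits(genes, scores, cutoff):
    """Return the set of genes whose score is at or below the cutoff."""
    return {g for g, s in zip(genes, scores) if s <= cutoff}

# Hypothetical readouts: most knockdowns have no effect, two reduce infection.
genes = [f"g{i}" for i in range(1, 11)]
readouts = [100, 98, 102, 101, 99, 97, 103, 100, 60, 20]
CUTOFF = -2.0  # illustrative threshold, not the p-value-equivalent used in the study

z_hits = select_hits(genes, z_scores(readouts), CUTOFF)
mad_hits = select_hits(genes, robust_z_scores(readouts), CUTOFF)

print("z-score hits:", sorted(z_hits))
print("MAD-based hits:", sorted(mad_hits))
```

In this toy example the two strong outliers inflate the standard deviation, so the classical z-score flags fewer wells than the MAD-based score at the same cutoff, which is one way identical data can yield different hit lists.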

  19. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence

    Science.gov (United States)

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A.; Garrido, Joseba M.; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-01-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  20. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    Directory of Open Access Journals (Sweden)

    José de la Fuente

    2015-11-01

    Full Text Available Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant

  1. Comparative Genomics of Field Isolates of Mycobacterium bovis and M. caprae Provides Evidence for Possible Correlates with Bacterial Viability and Virulence.

    Science.gov (United States)

    de la Fuente, José; Díez-Delgado, Iratxe; Contreras, Marinela; Vicente, Joaquín; Cabezas-Cruz, Alejandro; Tobes, Raquel; Manrique, Marina; López, Vladimir; Romero, Beatriz; Bezos, Javier; Dominguez, Lucas; Sevilla, Iker A; Garrido, Joseba M; Juste, Ramón; Madico, Guillermo; Jones-López, Edward; Gortazar, Christian

    2015-11-01

    Mycobacteria of the Mycobacterium tuberculosis complex (MTBC) greatly affect humans and animals worldwide. The life cycle of mycobacteria is complex and the mechanisms resulting in pathogen infection and survival in host cells are not fully understood. Recently, comparative genomics analyses have provided new insights into the evolution and adaptation of the MTBC to survive inside the host. However, most of this information has been obtained using M. tuberculosis but not other members of the MTBC such as M. bovis and M. caprae. In this study, the genome of three M. bovis (MB1, MB3, MB4) and one M. caprae (MB2) field isolates with different lesion score, prevalence and host distribution phenotypes were sequenced. Genome sequence information was used for whole-genome and protein-targeted comparative genomics analysis with the aim of finding correlates with phenotypic variation with potential implications for tuberculosis (TB) disease risk assessment and control. At the whole-genome level the results of the first comparative genomics study of field isolates of M. bovis including M. caprae showed that as previously reported for M. tuberculosis, sequential chromosomal nucleotide substitutions were the main driver of the M. bovis genome evolution. The phylogenetic analysis provided a strong support for the M. bovis/M. caprae clade, but supported M. caprae as a separate species. The comparison of the MB1 and MB4 isolates revealed differences in genome sequence, including gene families that are important for bacterial infection and transmission, thus highlighting differences with functional implications between isolates otherwise classified with the same spoligotype. 
    Strategic protein-targeted analysis using the ESX or type VII secretion system, proteins linking stress response with lipid metabolism, host T cell epitopes of mycobacteria, antigens and peptidoglycan assembly protein identified new genetic markers and candidate vaccine antigens that warrant further study to

  2. Genome-scale metabolic flux analysis of Streptomyces lividans growing on a complex medium.

    Science.gov (United States)

    D'Huys, Pieter-Jan; Lule, Ivan; Vercammen, Dominique; Anné, Jozef; Van Impe, Jan F; Bernaerts, Kristel

    2012-09-15

    Constraint-based metabolic modeling comprises various excellent tools to assess experimentally observed phenotypic behavior of micro-organisms in terms of intracellular metabolic fluxes. In combination with genome-scale metabolic networks, micro-organisms can be investigated in much more detail and under more complex environmental conditions. Although complex media are ubiquitously applied in industrial fermentations and are often a prerequisite for high protein secretion yields, such multi-component conditions are seldom investigated using genome-scale flux analysis. In this paper, a systematic and integrative approach is presented to determine metabolic fluxes in Streptomyces lividans TK24 grown on a nutritious and complex medium. Genome-scale flux balance analysis and randomized sampling of the solution space are combined to extract maximum information from exometabolome profiles. It is shown that biomass maximization cannot predict the observed metabolite production pattern as such. Although this cellular objective commonly applies to batch fermentation data, both input and output constraints are required to reproduce the measured biomass production rate. Rich media hence not necessarily lead to maximum biomass growth. To eventually identify a unique intracellular flux vector, a hierarchical optimization of cellular objectives is adopted. Out of various tested secondary objectives, maximization of the ATP yield per flux unit returns the closest agreement with the maximum frequency in flux histograms. This unique flux estimation is hence considered as a reasonable approximation for the biological fluxes. Flux maps for different growth phases show no active oxidative part of the pentose phosphate pathway, but NADPH generation in the TCA cycle and NADPH transdehydrogenase activity are most important in fulfilling the NADPH balance. 
    Amino acids contribute to biomass growth by augmenting the pool of available amino acids and by boosting the TCA cycle, particularly

  3. The future of genome-scale modeling of yeast through integration of a transcriptional regulatory network

    DEFF Research Database (Denmark)

    Liu, Guodong; Marras, Antonio; Nielsen, Jens

    2014-01-01

    Metabolism is regulated at multiple levels in response to the changes of internal or external conditions. Transcriptional regulation plays an important role in regulating many metabolic reactions by altering the concentrations of metabolic enzymes. Thus, integration of the transcriptional...... regulatory information is necessary to improve the accuracy and predictive ability of metabolic models. Here we review the strategies for the reconstruction of a transcriptional regulatory network (TRN) for yeast and the integration of such a reconstruction into a flux balance analysis-based metabolic model...... transcriptional regulatory interactions to genome-scale metabolic models in a quantitative manner....

  4. Determining the Control Circuitry of Redox Metabolism at the Genome-Scale

    DEFF Research Database (Denmark)

    Federowicz, Stephen; Kim, Donghyuk; Ebrahim, Ali

    2014-01-01

    ...... that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic...... -scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes......

  5. Screening of metagenomic and genomic libraries reveals three classes of bacterial enzymes that overcome the toxicity of acrylate.

    Science.gov (United States)

    Curson, Andrew R J; Burns, Oliver J; Voget, Sonja; Daniel, Rolf; Todd, Jonathan D; McInnis, Kathryn; Wexler, Margaret; Johnston, Andrew W B

    2014-01-01

    Acrylate is produced in significant quantities through the microbial cleavage of the highly abundant marine osmoprotectant dimethylsulfoniopropionate, an important process in the marine sulfur cycle. Acrylate can inhibit bacterial growth, likely through its conversion to the highly toxic molecule acrylyl-CoA. Previous work identified an acrylyl-CoA reductase, encoded by the gene acuI, as being important for conferring on bacteria the ability to grow in the presence of acrylate. However, some bacteria lack acuI, and, conversely, many bacteria that may not encounter acrylate in their regular environments do contain this gene. We therefore sought to identify new genes that might confer tolerance to acrylate. To do this, we used functional screening of metagenomic and genomic libraries to identify novel genes that corrected an E. coli mutant that was defective in acuI, and was therefore hyper-sensitive to acrylate. The metagenomic libraries yielded two types of genes that overcame this toxicity. The majority encoded enzymes resembling AcuI, but with significant sequence divergence among each other and previously ratified AcuI enzymes. One other metagenomic gene, arkA, had very close relatives in Bacillus and related bacteria, and is predicted to encode an enoyl-acyl carrier protein reductase, in the same family as FabK, which catalyses the final step in fatty-acid biosynthesis in some pathogenic Firmicute bacteria. A genomic library of Novosphingobium, a metabolically versatile alphaproteobacterium that lacks both acuI and arkA, yielded vutD and vutE, two genes that, together, conferred acrylate resistance. These encode sequential steps in the oxidative catabolism of valine in a pathway in which, significantly, methacrylyl-CoA is a toxic intermediate. 
    These findings expand the range of bacteria for which the acuI gene encodes a functional acrylyl-CoA reductase, and also identify novel enzymes that can similarly function in conferring acrylate resistance, likely, again

  6. Screening of metagenomic and genomic libraries reveals three classes of bacterial enzymes that overcome the toxicity of acrylate.

    Directory of Open Access Journals (Sweden)

    Andrew R J Curson

    Full Text Available Acrylate is produced in significant quantities through the microbial cleavage of the highly abundant marine osmoprotectant dimethylsulfoniopropionate, an important process in the marine sulfur cycle. Acrylate can inhibit bacterial growth, likely through its conversion to the highly toxic molecule acrylyl-CoA. Previous work identified an acrylyl-CoA reductase, encoded by the gene acuI, as being important for conferring on bacteria the ability to grow in the presence of acrylate. However, some bacteria lack acuI, and, conversely, many bacteria that may not encounter acrylate in their regular environments do contain this gene. We therefore sought to identify new genes that might confer tolerance to acrylate. To do this, we used functional screening of metagenomic and genomic libraries to identify novel genes that corrected an E. coli mutant that was defective in acuI, and was therefore hyper-sensitive to acrylate. The metagenomic libraries yielded two types of genes that overcame this toxicity. The majority encoded enzymes resembling AcuI, but with significant sequence divergence among each other and previously ratified AcuI enzymes. One other metagenomic gene, arkA, had very close relatives in Bacillus and related bacteria, and is predicted to encode an enoyl-acyl carrier protein reductase, in the same family as FabK, which catalyses the final step in fatty-acid biosynthesis in some pathogenic Firmicute bacteria. A genomic library of Novosphingobium, a metabolically versatile alphaproteobacterium that lacks both acuI and arkA, yielded vutD and vutE, two genes that, together, conferred acrylate resistance. These encode sequential steps in the oxidative catabolism of valine in a pathway in which, significantly, methacrylyl-CoA is a toxic intermediate. These findings expand the range of bacteria for which the acuI gene encodes a functional acrylyl-CoA reductase, and also identify novel enzymes that can similarly function in conferring acrylate

  7. Bacterial Genome Editing with CRISPR-Cas9: Deletion, Integration, Single Nucleotide Modification, and Desirable "Clean" Mutant Selection in Clostridium beijerinckii as an Example.

    Science.gov (United States)

    Wang, Yi; Zhang, Zhong-Tian; Seo, Seung-Oh; Lynn, Patrick; Lu, Ting; Jin, Yong-Su; Blaschek, Hans P

    2016-07-15

    CRISPR-Cas9 has been demonstrated as a transformative genome engineering tool for many eukaryotic organisms; however, its utilization in bacteria remains limited and ineffective. Here we explored Streptococcus pyogenes CRISPR-Cas9 for genome editing in Clostridium beijerinckii (industrially significant but notorious for being difficult to metabolically engineer) as a representative attempt to explore CRISPR-Cas9 for genome editing in microorganisms that previously lacked sufficient genetic tools. By combining inducible expression of Cas9 and plasmid-borne editing templates, we successfully achieved gene deletion and integration with high efficiency in single steps. We further achieved single nucleotide modification by applying innovative two-step approaches, which do not rely on availability of Protospacer Adjacent Motif sequences. Severe vector integration events were observed during the genome engineering process, which is likely difficult to avoid but has never been reported by other researchers for the bacterial genome engineering based on homologous recombination with plasmid-borne editing templates. We then further successfully employed CRISPR-Cas9 as an efficient tool for selecting desirable "clean" mutants in this study. The approaches we developed are broadly applicable and will open the way for precise genome editing in diverse microorganisms.

  8. Sexagesimal scale for mapping human genome

    Directory of Open Access Journals (Sweden)

    RICARDO CRUZ-COKE

    2001-03-01

    Full Text Available In a previous work I designed a diagram of the human genome based on a circular ideogram of the haploid set of chromosomes, using a low resolution scale of Megabase units. The purpose of this work is to draft a new scale to measure the physical map of the human genome at the highest resolution level. The entire length of the haploid genome of males is deployed in a circumference, marked with a sexagesimal scale with 360 degrees and 1296000 arc seconds. The radius of this circumference displays a semilogarithmic metric scale from 1 m up to the nanometer level. The base pair level of DNA sequences, 10^-9 of this circumference, is measured in milliarcsecond units (mas), each equivalent to a thousandth of an arcsecond. The "mas" unit corresponds to 1.27 nanometers (nm) or 0.427 base pair (bp) and is the framework for measuring DNA sequences. Thus the three billion base pairs of the human genome may be identified by 1296000000 "mas" units in continuous correlation from number 1 to number 1296000000. This sexagesimal scale covers all the levels of the nuclear genetic material, from nucleotides to chromosomes. The locations of every codon and every gene may be numbered in the physical map of chromosome regions according to this new scale, instead of the partial kilobase and Megabase scales used today. The advantage of the new scale is the unification of the set of chromosomes under a continuous scale of measurement at the DNA level, facilitating the correlation with the phenotypes of man and other species.
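
    The arithmetic behind such a scale can be illustrated with a minimal sketch. It assumes a haploid genome of exactly three billion base pairs (the round figure quoted in the abstract) spread uniformly around the circle. The function name and the example positions are hypothetical and are not taken from the paper.

```python
MAS_PER_ARCSEC = 1000          # 1 arcsecond = 1000 milliarcseconds (mas)
ARCSEC_PER_DEGREE = 3600       # 60 arcminutes x 60 arcseconds
TOTAL_MAS = 360 * ARCSEC_PER_DEGREE * MAS_PER_ARCSEC  # 1,296,000,000 mas
GENOME_BP = 3_000_000_000      # assumption: round genome length from the abstract


def bp_to_sexagesimal(position_bp, genome_bp=GENOME_BP):
    """Map a 1-based base-pair position to (degrees, arcmin, arcsec, mas).

    Assumes base pairs are laid out uniformly around the full circle.
    """
    if not 1 <= position_bp <= genome_bp:
        raise ValueError("position outside the genome")
    # Integer arithmetic avoids floating-point rounding on large positions.
    total_mas = position_bp * TOTAL_MAS // genome_bp
    degrees, rem = divmod(total_mas, ARCSEC_PER_DEGREE * MAS_PER_ARCSEC)
    arcmin, rem = divmod(rem, 60 * MAS_PER_ARCSEC)
    arcsec, mas = divmod(rem, MAS_PER_ARCSEC)
    return degrees, arcmin, arcsec, mas


if __name__ == "__main__":
    print(bp_to_sexagesimal(1_500_000_000))  # halfway around the circle
    print(bp_to_sexagesimal(1_000_000))      # one megabase from the origin
```

    Under this uniform-mapping assumption one base pair spans 0.432 mas. The per-unit figures quoted in the abstract may rest on a different genome length, so the numbers here are illustrative only.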

  9. Factors Controlling Soil Microbial Biomass and Bacterial Diversity and Community Composition in a Cold Desert Ecosystem: Role of Geographic Scale.

    Directory of Open Access Journals (Sweden)

    David J Van Horn

    Full Text Available Understanding controls over the distribution of soil bacteria is a fundamental step toward describing soil ecosystems, understanding their functional capabilities, and predicting their responses to environmental change. This study investigated the controls on the biomass, species richness, and community structure and composition of soil bacterial communities in the McMurdo Dry Valleys, Antarctica, at local and regional scales. The goals of the study were to describe the relationships between abiotic characteristics and soil bacteria in this unique, microbially dominated environment, and to test the scale dependence of these relationships in a low complexity ecosystem. Samples were collected from dry mineral soils associated with snow patches, which are a significant source of water in this desert environment, at six sites located in the major basins of the Taylor and Wright Valleys. Samples were analyzed for a suite of characteristics including soil moisture, pH, electrical conductivity, soil organic matter, major nutrients and ions, microbial biomass, 16S rRNA gene richness, and bacterial community structure and composition. Snow patches created local biogeochemical gradients while inter-basin comparisons encompassed landscape scale gradients enabling comparisons of microbial controls at two distinct spatial scales. At the organic carbon rich, mesic, low elevation sites Acidobacteria and Actinobacteria were prevalent, while Firmicutes and Proteobacteria were dominant at the high elevation, low moisture and biomass sites. Microbial parameters were significantly related to soil water content and edaphic characteristics including soil pH, organic matter, and sulfate. However, the magnitude and even the direction of these relationships varied across basins and the application of mixed effects models revealed evidence of significant contextual effects at local and regional scales. The results highlight the importance of the geographic scale of

  10. Advances in the integration of transcriptional regulatory information into genome-scale metabolic models.

    Science.gov (United States)

    Vivek-Ananth, R P; Samal, Areejit

    2016-09-01

    A major goal of systems biology is to build predictive computational models of cellular metabolism. Availability of complete genome sequences and wealth of legacy biochemical information has led to the reconstruction of genome-scale metabolic networks in the last 15 years for several organisms across the three domains of life. Due to paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation comprising mRNA transcript levels and genome-wide binding profile of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks.

  11. Genome-scale metabolic network validation of Shewanella oneidensis using transposon insertion frequency analysis.

    Directory of Open Access Journals (Sweden)

    Hong Yang

    2014-09-01

    Full Text Available Transposon mutagenesis, in combination with parallel sequencing, is becoming a powerful tool for en-masse mutant analysis. A probability generating function was used to explain observed miniHimar transposon insertion patterns, and gene essentiality calls were made by transposon insertion frequency analysis (TIFA). TIFA incorporated the observed genome and sequence motif bias of the miniHimar transposon. The gene essentiality calls were compared to: (1) previous genome-wide direct gene-essentiality assignments; and (2) flux balance analysis (FBA) predictions from an existing genome-scale metabolic model of Shewanella oneidensis MR-1. A three-way comparison between FBA, TIFA, and the direct essentiality calls was made to validate the TIFA approach. The refinement in the interpretation of observed transposon insertions demonstrated that genes without insertions are not necessarily essential, and that genes that contain insertions are not always nonessential. The TIFA calls were in reasonable agreement with direct essentiality calls for S. oneidensis, but agreed more closely with E. coli essentiality calls for orthologs. The TIFA gene essentiality calls were in good agreement with the MR-1 FBA essentiality predictions, and the agreement between TIFA and FBA predictions was substantially better than between the FBA and the direct gene essentiality predictions.

  12. T346Hunter: a novel web-based tool for the prediction of type III, type IV and type VI secretion systems in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Pedro Manuel Martínez-García

    Full Text Available T346Hunter (Type Three, Four and Six secretion system Hunter) is a web-based tool for the identification and localisation of type III, type IV and type VI secretion systems (T3SS, T4SS and T6SS, respectively) clusters in bacterial genomes. Non-flagellar T3SS (NF-T3SS) and T6SS are complex molecular machines that deliver effector proteins from bacterial cells into the environment or into other eukaryotic or prokaryotic cells, with significant implications for pathogenesis of the strains encoding them. Meanwhile, T4SS is a more functionally diverse system, which is involved in not only effector translocation but also conjugation and DNA uptake/release. Development of control strategies against bacterial-mediated diseases requires genomic identification of the virulence arsenal of pathogenic bacteria, with T3SS, T4SS and T6SS being major determinants in this regard. Therefore, computational methods for systematic identification of these specialised machines are of particular interest. With the aim of facilitating this task, T346Hunter provides a user-friendly web-based tool for the prediction of T3SS, T4SS and T6SS clusters in newly sequenced bacterial genomes. After inspection of the available scientific literature, we constructed a database of hidden Markov model (HMM) protein profiles and sequences representing the various components of T3SS, T4SS and T6SS. T346Hunter performs searches of such a database against user-supplied bacterial sequences and localises enriched regions in any of these three types of secretion systems. Moreover, through the T346Hunter server, users can visualise the predicted clusters obtained for approximately 1700 bacterial chromosomes and plasmids. T346Hunter offers great help to researchers in advancing their understanding of the biological mechanisms in which these sophisticated molecular machines are involved. T346Hunter is freely available at http://bacterial-virulence-factors.cbgp.upm.es/T346Hunter.

  13. Simplified large-scale Sanger genome sequencing for influenza A/H3N2 virus.

    Directory of Open Access Journals (Sweden)

    Hong Kai Lee

    Full Text Available BACKGROUND: The advent of next-generation sequencing technologies and the resultant lower costs of sequencing have enabled production of massive amounts of data, including the generation of full genome sequences of pathogens. However, the small genome size of the influenza virus arguably justifies the use of the more conventional Sanger sequencing technology which is still currently more readily available in most diagnostic laboratories. RESULTS: We present a simplified Sanger-based genome sequencing method for sequencing the influenza A/H3N2 virus in a large-scale format. The entire genome sequencing was completed with 19 reverse transcription-polymerase chain reactions (RT-PCRs) and 39 sequencing reactions. This method was tested on 15 native clinical samples and 15 culture isolates, respectively, collected between 2009 and 2011. The 15 native clinical samples registered quantification cycle values ranging from 21.0 to 30.56, which were equivalent to 2.4×10^3 to 1.4×10^6 viral copies/µL of RNA extract. All the PCR-amplified products were sequenced directly without PCR product purification. Notably, high quality sequencing data up to 700 bp were generated for all the samples tested. The completed sequence covered 408,810 nucleotides in total, with 13,627 nucleotides per genome, attaining 100% coding completeness. Of all the bases produced, an average of 89.49% were Phred quality value 40 (QV40) bases (representing an accuracy of circa one miscall for every 10,000 bases) or higher, and an average of 93.46% were QV30 bases (one miscall every 1000 bases) or higher. CONCLUSIONS: This sequencing protocol has been shown to be cost-effective and less labor-intensive in obtaining full influenza genomes. The constant high quality of sequences generated imparts confidence in extending the application of this non-purified amplicon sequencing approach to other gene sequencing assays, with appropriate use of suitably designed primers.
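
    The quality figures quoted above follow from the standard Phred definition, Q = -10 log10(P), where P is the probability that a base call is wrong. The short sketch below reproduces the "one miscall per 10,000 bases" and "one per 1000 bases" equivalences and checks the per-genome length from the totals in the abstract. The function names are illustrative, and the split of 408,810 nucleotides over 30 samples (15 clinical plus 15 culture) is inferred from the abstract.

```python
def phred_to_error_probability(q):
    """Probability that a base call is wrong, given its Phred quality value."""
    return 10 ** (-q / 10)


def expected_miscalls(q, n_bases):
    """Expected number of miscalled bases among n_bases, all at quality q."""
    return n_bases * phred_to_error_probability(q)


# Totals reported in the abstract; 30 samples = 15 clinical + 15 culture.
TOTAL_NUCLEOTIDES = 408_810
N_SAMPLES = 30
NUCLEOTIDES_PER_GENOME = TOTAL_NUCLEOTIDES // N_SAMPLES

if __name__ == "__main__":
    for q in (30, 40):
        p = phred_to_error_probability(q)
        print(f"QV{q}: about one miscall per {round(1 / p):,} bases")
    print("nucleotides per genome:", NUCLEOTIDES_PER_GENOME)
```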

  14. Why close a bacterial genome? The plasmid of Alteromonas macleodii HOT1A3 is a vector for inter-specific transfer of a flexible genomic island

    Directory of Open Access Journals (Sweden)

    Eduard Fadeev

    2016-03-01

    Full Text Available Genome sequencing is rapidly becoming a staple technique in environmental and clinical microbiology, yet computational challenges still remain, leading to many draft genomes which are typically fragmented into many contigs. We sequenced and completely assembled the genome of a marine heterotrophic bacterium, Alteromonas macleodii HOT1A3, and compared its full genome to several draft genomes obtained using different reference-based and de-novo methods. In general, the de-novo assemblies clearly outperformed the reference-based or hybrid ones, covering >99% of the genes and representing essentially all of the gene functions. However, only the fully closed genome (~4.5 Mbp) allowed us to identify the presence of a large, 148 kbp plasmid, pAM1A3. While HOT1A3 belongs to Alteromonas macleodii, typically found in surface waters (surface ecotype), this plasmid consists of an almost complete flexible genomic island, containing many genes involved in metal resistance previously identified in the genomes of Alteromonas mediterranea (deep ecotype). Indeed, similar to A. mediterranea, A. macleodii HOT1A3 grows at concentrations of zinc, mercury and copper that are inhibitory for other A. macleodii strains. The presence of a plasmid encoding almost an entire flexible genomic island suggests that wholesale genomic exchange between heterotrophic marine bacteria belonging to related but ecologically different populations is not uncommon.

  15. Genome scale evolution of myxoma virus reveals host-pathogen adaptation and rapid geographic spread.

    Science.gov (United States)

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Twaddle, Alan C; Hudson, Peter J; Tscharke, David C; Read, Andrew F; Holmes, Edward C; Ghedin, Elodie

    2013-12-01

    The evolutionary interplay between myxoma virus (MYXV) and the European rabbit (Oryctolagus cuniculus) following release of the virus in Australia in 1950 as a biological control is a classic example of host-pathogen coevolution. We present a detailed genomic and phylogeographic analysis of 30 strains of MYXV, including the Australian progenitor strain Standard Laboratory Strain (SLS), 24 Australian viruses isolated from 1951 to 1999, and three isolates from the early radiation in Britain from 1954 and 1955. We show that in Australia MYXV has spread rapidly on a spatial scale, with multiple lineages cocirculating within individual localities, and that both highly virulent and attenuated viruses were still present in the field through the 1990s. In addition, the detection of closely related virus lineages at sites 1,000 km apart suggests that MYXV moves freely in geographic space, with mosquitoes, fleas, and rabbit migration all providing means of transport. Strikingly, despite multiple introductions, all modern viruses appear to be ultimately derived from the original introductions of SLS. The rapidity of MYXV evolution was also apparent at the genomic scale, with gene duplications documented in a number of viruses. Duplication of potential virulence genes may be important in increasing the expression of virulence proteins and provides the basis for the evolution of novel functions. Mutations leading to loss of open reading frames were surprisingly frequent and in some cases may explain attenuation, but no common mutations that correlated with virulence or attenuation were identified.

  16. A Method to Constrain Genome-Scale Models with 13C Labeling Data.

    Directory of Open Access Journals (Sweden)

    Héctor García Martín

    2015-09-01

    Full Text Available Current limitations in quantitatively predicting biological behavior hinder our efforts to engineer biological systems to produce biofuels and other desired chemicals. Here, we present a new method for calculating metabolic fluxes, key targets in metabolic engineering, that incorporates data from 13C labeling experiments and genome-scale models. The data from 13C labeling experiments provide strong flux constraints that eliminate the need to assume an evolutionary optimization principle such as the growth rate optimization assumption used in Flux Balance Analysis (FBA). This effective constraining is achieved by making the simple but biologically relevant assumption that flux flows from core to peripheral metabolism and does not flow back. The new method is significantly more robust than FBA with respect to errors in genome-scale model reconstruction. Furthermore, it can provide a comprehensive picture of metabolite balancing and predictions for unmeasured extracellular fluxes as constrained by 13C labeling data. A comparison shows that the results of this new method are similar to those found through 13C Metabolic Flux Analysis (13C MFA) for central carbon metabolism but, additionally, it provides flux estimates for peripheral metabolism. The extra validation gained by matching 48 relative labeling measurements is used to identify where and why several existing COnstraint Based Reconstruction and Analysis (COBRA) flux prediction algorithms fail. We demonstrate how to use this knowledge to refine these methods and improve their predictive capabilities. This method provides a reliable base upon which to improve the design of biological systems.

  17. An experimentally-supported genome-scale metabolic network reconstruction for Yersinia pestis CO92

    Directory of Open Access Journals (Sweden)

    Motin Vladimir L

    2011-10-01

    Full Text Available BACKGROUND: Yersinia pestis is a gram-negative bacterium that causes plague, a disease linked historically to the Black Death in Europe during the Middle Ages and to several outbreaks during the modern era. Metabolism in Y. pestis displays remarkable flexibility and robustness, allowing the bacterium to proliferate in both warm-blooded mammalian hosts and cold-blooded insect vectors such as fleas. RESULTS: Here we report a genome-scale reconstruction and mathematical model of metabolism for Y. pestis CO92 and supporting experimental growth and metabolite measurements. The model contains 815 genes, 678 proteins, 963 unique metabolites and 1678 reactions, accurately simulates growth on a range of carbon sources both qualitatively and quantitatively, and identifies gaps in several key biosynthetic pathways and suggests how those gaps might be filled. Furthermore, our model presents hypotheses to explain certain known nutritional requirements characteristic of this strain. CONCLUSIONS: Y. pestis continues to be a dangerous threat to human health during modern times. The Y. pestis genome-scale metabolic reconstruction presented here, which has been benchmarked against experimental data and correctly reproduces known phenotypes, provides an in silico platform with which to investigate the metabolism of this important human pathogen.

  18. A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica

    Directory of Open Access Journals (Sweden)

    Loira Nicolas

    2012-05-01

    Full Text Available BACKGROUND: Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. RESULTS: Combining in silico tools and expert manual curation, we have produced an accurate genome-scale metabolic model for Y. lipolytica. Using a scaffold derived from a functional metabolic model of the well-studied but phylogenetically distant yeast S. cerevisiae, we mapped conserved reactions, rewrote gene associations, added species-specific reactions and inserted specialized copies of scaffold reactions to account for species-specific expansion of protein families. We used physiological measures obtained under lab conditions to validate our predictions. CONCLUSIONS: Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts.

  19. Quantitative assessment of thermodynamic constraints on the solution space of genome-scale metabolic models.

    Science.gov (United States)

    Hamilton, Joshua J; Dwivedi, Vivek; Reed, Jennifer L

    2013-07-16

    Constraint-based methods provide powerful computational techniques to allow understanding and prediction of cellular behavior. These methods rely on physicochemical constraints to eliminate infeasible behaviors from the space of available behaviors. One such constraint is thermodynamic feasibility, the requirement that intracellular flux distributions obey the laws of thermodynamics. The past decade has seen several constraint-based methods that interpret this constraint in different ways, including those that are limited to small networks, rely on predefined reaction directions, and/or neglect the relationship between reaction free energies and metabolite concentrations. In this work, we utilize one such approach, thermodynamics-based metabolic flux analysis (TMFA), to make genome-scale, quantitative predictions about metabolite concentrations and reaction free energies in the absence of prior knowledge of reaction directions, while accounting for uncertainties in thermodynamic estimates. We applied TMFA to a genome-scale network reconstruction of Escherichia coli and examined the effect of thermodynamic constraints on the flux space. We also assessed the predictive performance of TMFA against gene essentiality and quantitative metabolomics data, under both aerobic and anaerobic, and optimal and suboptimal growth conditions. Based on these results, we propose that TMFA is a useful tool for validating phenotypes and generating hypotheses, and that additional types of data and constraints can improve predictions of metabolite concentrations.
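
    The relationship between reaction free energy and metabolite concentrations mentioned above is the textbook expression ΔG = ΔG° + RT ln Q, where Q is the reaction quotient. A reaction can carry net forward flux only when ΔG is negative. The sketch below illustrates that single relation; it is not the TMFA optimization itself, and the function names and the example values (a hypothetical reaction with a standard free energy of +5 kJ/mol) are assumptions made for illustration.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def reaction_free_energy(dg0_kj, substrates, products, temperature_k=298.15):
    """Return dG = dG0 + R*T*ln(Q) in kJ/mol.

    substrates and products are lists of (concentration in mol/L,
    stoichiometric coefficient) pairs; activities are approximated by
    concentrations.
    """
    ln_q = sum(n * math.log(c) for c, n in products)
    ln_q -= sum(n * math.log(c) for c, n in substrates)
    return dg0_kj + R_KJ_PER_MOL_K * temperature_k * ln_q


def forward_feasible(dg_kj):
    """Net forward flux is thermodynamically allowed only if dG < 0."""
    return dg_kj < 0


if __name__ == "__main__":
    # Hypothetical reaction A -> B with an unfavourable standard free energy.
    dg = reaction_free_energy(5.0, substrates=[(1e-2, 1)], products=[(1e-4, 1)])
    print(f"dG = {dg:.2f} kJ/mol, forward feasible: {forward_feasible(dg)}")
```

    The example shows why concentrations matter: a reaction with a positive standard free energy can still run forward when the product is kept at a low enough concentration relative to the substrate.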

  20. Genome-scale DNA sequence recognition by hybridization to short oligomers.

    Science.gov (United States)

    Milosavljević, A; Savković, S; Crkvenjakov, R; Salbego, D; Serrato, H; Kreuzer, H; Gemmell, A; Batus, S; Grujić, D; Carnahan, S; Tepavcević, J

    1996-01-01

    Recently developed hybridization technology (Drmanac et al. 1994) enables economical large-scale detection of short oligomers within DNA fragments. The newly developed recognition method (Milosavljević 1995b) enables comparison of lists of oligomers detected within DNA fragments against known DNA sequences. We here describe an experiment involving a set of 4,513 distinct genomic E. coli clones of average length 2 kb, each hybridized with 636 randomly selected short oligomer probes. High hybridization signal with a particular probe was used as an indication of the presence of a complementary oligomer in the particular clone. For each clone, a list of oligomers with highest hybridization signals was compiled. The database consisting of 4,513 oligomer lists was then searched using known E. coli sequences as queries in an attempt to identify the clones that match the query sequence. Out of a total of 11 clones that were recognized at highest significance level by our method, 8 were single-pass sequenced from both ends. The single-pass sequenced ends were then compared against the query sequences. The sequence comparisons confirmed 7 out of the total of 8 examined recognitions. This experiment represents the first successful example of genome-scale sequence recognition based on hybridization data.
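
    The idea of searching a database of per-clone oligomer lists with a known sequence can be sketched as follows. This is a deliberately simplified illustration: it scores each clone by the fraction of its listed oligomers that occur on either strand of the query, and it is not the significance statistic used in the study. All names, the oligomer length and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def oligomers_in(seq, k):
    """Set of all k-mers present on either strand of seq."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return {s[i:i + k] for s in strands for i in range(len(s) - k + 1)}


def rank_clones(query, clone_oligo_lists, k):
    """Rank clones by the fraction of their listed oligomers found in query.

    clone_oligo_lists maps a clone name to the oligomers with the highest
    hybridization signal for that clone. Clones with empty lists score 0.
    """
    query_oligos = oligomers_in(query, k)
    scores = {}
    for clone, oligos in clone_oligo_lists.items():
        unique = set(oligos)
        scores[clone] = len(unique & query_oligos) / len(unique) if unique else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_database = {
        "clone_A": ["ATGG", "CGTA", "TTAG"],
        "clone_B": ["AAAA", "CCCC", "ATGG"],
    }
    print(rank_clones("ATGGCGTACGTTAGC", toy_database, k=4))
```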

  1. Identification of novel targets for breast cancer by exploring gene switches on a genome scale

    Directory of Open Access Journals (Sweden)

    Wu Ming

    2011-11-01

    Full Text Available BACKGROUND: An important feature that emerges from analyzing gene regulatory networks is the "switch-like behavior" or "bistability", a dynamic feature of a particular gene to preferentially toggle between two steady-states. The state of gene switches plays pivotal roles in cell fate decision, but identifying switches has been difficult. Therefore a challenge confronting the field is to be able to systematically identify gene switches. RESULTS: We propose a top-down mining approach to exploring gene switches on a genome-scale level. Theoretical analysis, proof-of-concept examples, and experimental studies demonstrate the ability of our mining approach to identify bistable genes by sampling across a variety of different conditions. Applying the approach to human breast cancer data identified genes that show bimodality within the cancer samples, such as estrogen receptor (ER) and ERBB2, as well as genes that show bimodality between cancer and non-cancer samples, where tumor-associated calcium signal transducer 2 (TACSTD2) is uncovered. We further suggest a likely transcription factor that regulates TACSTD2. CONCLUSIONS: Our mining approach demonstrates that one can capitalize on genome-wide expression profiling to capture dynamic properties of a complex network. To the best of our knowledge, this is the first attempt in applying mining approaches to explore gene switches on a genome-scale, and the identification of TACSTD2 demonstrates that single cell-level bistability can be predicted from microarray data. Experimental confirmation of the computational results suggests TACSTD2 could be a potential biomarker and attractive candidate for drug therapy against both ER+ and ER- subtypes of breast cancer, including the triple negative subtype.

  2. Comparative genomic analysis of Xanthomonas axonopodis pv. citrumelo F1, which causes citrus bacterial spot disease, and related strains provides insights into virulence and host specificity.

    Science.gov (United States)

    Jalan, Neha; Aritua, Valente; Kumar, Dibyendu; Yu, Fahong; Jones, Jeffrey B; Graham, James H; Setubal, João C; Wang, Nian

    2011-11-01

    Xanthomonas axonopodis pv. citrumelo is a citrus pathogen causing citrus bacterial spot disease that is geographically restricted within the state of Florida. Illumina, 454 sequencing, and optical mapping were used to obtain a complete genome sequence of X. axonopodis pv. citrumelo strain F1, 4.9 Mb in size. The strain lacks plasmids, in contrast to other citrus Xanthomonas pathogens. Phylogenetic analysis revealed that this pathogen is very close to the tomato bacterial spot pathogen X. campestris pv. vesicatoria 85-10, with a completely different host range. We also compared X. axonopodis pv. citrumelo to the genome of citrus canker pathogen X. axonopodis pv. citri 306. Comparative genomic analysis showed differences in several gene clusters, like those for type III effectors, the type IV secretion system, lipopolysaccharide synthesis, and others. In addition to pthA, effectors such as xopE3, xopAI, and hrpW were absent from X. axonopodis pv. citrumelo while present in X. axonopodis pv. citri. These effectors might be responsible for survival and the low virulence of this pathogen on citrus compared to that of X. axonopodis pv. citri. We also identified unique effectors in X. axonopodis pv. citrumelo that may be related to the different host range as compared to that of X. axonopodis pv. citri. X. axonopodis pv. citrumelo also lacks various genes, such as syrE1, syrE2, and RTX toxin family genes, which were present in X. axonopodis pv. citri. These may be associated with the distinct virulences of X. axonopodis pv. citrumelo and X. axonopodis pv. citri. Comparison of the complete genome sequence of X. axonopodis pv. citrumelo to those of X. axonopodis pv. citri and X. campestris pv. vesicatoria provides valuable insights into the mechanism of bacterial virulence and host specificity.

  3. Integrating Kinetic Model of E. coli with Genome Scale Metabolic Fluxes Overcomes Its Open System Problem and Reveals Bistability in Central Metabolism.

    Directory of Open Access Journals (Sweden)

    Ahmad A Mannan

    Full Text Available An understanding of the dynamics of the metabolic profile of a bacterial cell is sought from a dynamical systems analysis of kinetic models. This modelling formalism relies on a deterministic mathematical description of enzyme kinetics and their metabolite regulation. However, it is severely impeded by the lack of available kinetic information, limiting the size of the system that can be modelled. Furthermore, the subsystem of the metabolic network whose dynamics can be modelled is faced with three problems: how to parameterize the model with mostly incomplete steady state data, how to close what is now an inherently open system, and how to account for the impact on growth. In this study we address these challenges of kinetic modelling by capitalizing on multi-'omics' steady state data and a genome-scale metabolic network model. We use these to generate parameters that integrate knowledge embedded in the genome-scale metabolic network model, into the most comprehensive kinetic model of the central carbon metabolism of E. coli realized to date. As an application, we performed a dynamical systems analysis of the resulting enriched model. This revealed bistability of the central carbon metabolism and thus its potential to express two distinct metabolic states. Furthermore, since our model-informing technique ensures both stable states are constrained by the same thermodynamically feasible steady state growth rate, the ensuing bistability represents a temporal coexistence of the two states, and by extension, reveals the emergence of a phenotypically heterogeneous population.

  4. Diel-scale temporal dynamics recorded for bacterial groups in Namib Desert soil

    Science.gov (United States)

    Gunnigle, Eoin; Frossard, Aline; Ramond, Jean-Baptiste; Guerrero, Leandro; Seely, Mary; Cowan, Don A.

    2017-01-01

    Microbes in hot desert soil partake in core ecosystem processes, e.g., biogeochemical cycling of carbon. Nevertheless, there is still a fundamental lack of insights regarding short-term (i.e., over a 24-hour [diel] cycle) microbial responses to highly fluctuating microenvironmental parameters like temperature and humidity. To address this, we employed T-RFLP fingerprinting and 454 pyrosequencing of 16S rRNA-derived cDNA to characterize potentially active bacteria in Namib Desert soil over multiple diel cycles. Strikingly, we found that significant shifts in active bacterial groups could occur over a single 24-hour period. For instance, members of the predominant phylum Actinobacteria exhibited a significant reduction in relative activity from morning to night, whereas many Proteobacterial groups displayed an opposite trend. Contrary to our leading hypothesis, environmental parameters could only account for 10.5% of the recorded total variation. Potential biotic associations shown through co-occurrence networks indicated that non-random inter- and intra-phyla associations were ‘time-of-day-dependent’ which may constitute a key feature of this system. Notably, many cyanobacterial groups were positioned outside and/or between highly interconnected bacterial associations (modules); possibly acting as inter-module ‘hubs’ orchestrating interactions between important functional consortia. Overall, these results provide empirical evidence that bacterial communities in hot desert soils exhibit complex and diel-dependent inter-community associations. PMID:28071697

  5. Biological Removal of Phosphate Using Phosphate Solubilizing Bacterial Consortium from Synthetic Wastewater: A Laboratory Scale

    Directory of Open Access Journals (Sweden)

    Dipak Paul

    2015-01-01

    Full Text Available Biological phosphate removal is an important process that has gained worldwide attention and is widely used to remove phosphorus from wastewater. The present investigation aimed to screen efficient phosphate-solubilizing bacterial isolates and use them to remove phosphate from synthetic wastewater under shake-flask conditions. Pseudomonas sp. JPSB12, Enterobacter sp. TPSB20, Flavobacterium sp. TPSB23 and a mixed bacterial consortium (Pseudomonas sp. JPSB12 + Enterobacter sp. TPSB20 + Flavobacterium sp. TPSB23) were used for the removal of phosphate. Among the individual strains, Enterobacter sp. TPSB20 removed the most phosphate (61.75%) from synthetic wastewater in the presence of glucose as a carbon source. The consortium removed phosphate more effectively (74.15-82.50%) than the individual strains. The pH changes in the culture medium over time and extracellular phosphatase activity (acid and alkaline) were also investigated. The efficient removal of phosphate by the consortium may be due to synergistic activity among the individual strains and to phosphatase enzyme activity. The use of bacterial consortia in the remediation of phosphate-contaminated aquatic environments is discussed.

  6. In situ probing the interior of single bacterial cells at nanometer scale

    Science.gov (United States)

    Liu, Boyin; Hemayet Uddin, Md; Ng, Tuck Wah; Paterson, David L.; Velkov, Tony; Li, Jian; Fu, Jing

    2014-10-01

    We report a novel approach to probe the interior of single bacterial cells at nanometre resolution by combining focused ion beam (FIB) and atomic force microscopy (AFM). After removing layers of pre-defined thickness in the order of 100 nm on the target bacterial cells with FIB milling, AFM of different modes can be employed to probe the cellular interior under both ambient and aqueous environments. Our initial investigations focused on the surface topology induced by FIB milling and the hydration effects on AFM measurements, followed by assessment of the sample protocols. With fine-tuning of the process parameters, in situ AFM probing beneath the bacterial cell wall was achieved for the first time. We further demonstrate the proposed method by performing a spatial mapping of intracellular elasticity and chemistry of the multi-drug resistant strain Klebsiella pneumoniae cells prior to and after it was exposed to the ‘last-line’ antibiotic polymyxin B. Our results revealed increased stiffness occurring in both surface and interior regions of the treated cells, suggesting loss of integrity of the outer membrane from polymyxin treatments. In addition, the hydrophobicity measurement using a functionalized AFM tip was able to highlight the evident hydrophobic portion of the cell such as the regions containing cell membrane. We expect that the proposed FIB-AFM platform will help in gaining deeper insights of bacteria-drug interactions to develop potential strategies for combating multi-drug resistance.

  8. Comparison of HapMap and 1000 genomes reference panels in a large-scale genome-wide association study

    NARCIS (Netherlands)

    P.S. de Vries (Paul); M. Sabater-Lleal (Maria); D.I. Chasman (Daniel); S. Trompet (Stella); T.S. Ahluwalia (Tarunveer Singh); A. Teumer (Alexander); M.E. Kleber (Marcus); M.-H. Chen (Ming-Huei); J.J. Wang (Jie Jin); J. Attia (John); R.E. Marioni (Riccardo); M. Steri (Maristella); Weng, L.-C. (Lu-Chen); R. Pool (Reńe); V. Grossmann (Vera); J. Brody (Jennifer); C. Venturini (Cristina); T. Tanaka (Toshiko); L.M. Rose (Lynda); C. Oldmeadow (Christopher); J. Mazur (Johanna); S. Basu (Saonli); M. Frånberg (Mattias); Q. Yang (Qiong); S. Ligthart (Symen); J.J. Hottenga (Jouke Jan); A. Rumley (Ann); Mulas, A. (Antonella); A.J. de Craen (Anton); A. Grotevendt (Anne); K.D. Taylor (Kent D.); G. Delgado; A. Kifley (Annette); L.M. Lopez (Lorna); T.L. Berentzen (Tina L.); M. Mangino (Massimo); S. Bandinelli (Stefania); Morrison, A.C. (Alanna C.); A. Hamsten (Anders); G.H. Tofler (Geoffrey); M.P.M. de Maat (Moniek); G. Draisma (Gerrit); G.D. Lowe (Gordon D.); M. Zoledziewska (Magdalena); N. Sattar (Naveed); Lackner, K.J. (Karl J.); U. Völker (Uwe); McKnight, B. (Barbara); J. Huang (Jian); E.G. Holliday (Elizabeth); McEvoy, M.A. (Mark A.); J.M. Starr (John); P.G. Hysi (Pirro); D.G. Hernandez (Dena); W. Guan (Weihua); F. Rivadeneira Ramirez (Fernando); W.L. McArdle (Wendy); P.E. Slagboom (Eline); Zeller, T. (Tanja); B.M. Psaty (Bruce); A.G. Uitterlinden (André); E.J.C. de Geus (Eco); D.J. Stott (David J.); H. Binder (Harald); A. Hofman (Albert); O.H. Franco (Oscar); J.I. Rotter (Jerome I.); L. Ferrucci (Luigi); Spector, T.D. (Tim D.); I.J. Deary (Ian J.); W. März (Winfried); A. Greinacher (Andreas); P.S. Wild (Philipp S.); F. Cucca (Francesco); D.I. Boomsma (Dorret); Watkins, H. (Hugh); Tang, W. (Weihong); P.M. Ridker (Paul); J.W. Jukema; R.J. Scott (Rodney J.); P. Mitchell (Paul); T. Hansen (T.); O'Donnell, C.J. (Christopher J.); Smith, N.L. (Nicholas L.); D.P. Strachan (David P.); A. Dehghan (Abbas)

    2017-01-01

    An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imput

  9. Dynamics of bacterial populations during bench-scale bioremediation of oily seawater and desert soil bioaugmented with coastal microbial mats.

    Science.gov (United States)

    Ali, Nidaa; Dashti, Narjes; Salamah, Samar; Sorkhoh, Naser; Al-Awadhi, Husain; Radwan, Samir

    2016-03-01

    This study describes a bench-scale attempt to bioremediate Kuwaiti, oily water and soil samples through bioaugmentation with coastal microbial mats rich in hydrocarbonoclastic bacterioflora. Seawater and desert soil samples were artificially polluted with 1% weathered oil, and bioaugmented with microbial mat suspensions. Oil removal and microbial community dynamics were monitored. In batch cultures, oil removal was more effective in soil than in seawater. Hydrocarbonoclastic bacteria associated with mat samples colonized soil more readily than seawater. The predominant oil degrading bacterium in seawater batches was the autochthonous seawater species Marinobacter hydrocarbonoclasticus. The main oil degraders in the inoculated soil samples, on the other hand, were a mixture of the autochthonous mat and desert soil bacteria; Xanthobacter tagetidis, Pseudomonas geniculata, Olivibacter ginsengisoli and others. More bacterial diversity prevailed in seawater during continuous than batch bioremediation. Out of seven hydrocarbonoclastic bacterial species isolated from those cultures, only one, Mycobacterium chlorophenolicum, was of mat origin. This result too confirms that most of the autochthonous mat bacteria failed to colonize seawater. Also culture-independent analysis of seawater from continuous cultures revealed high-bacterial diversity. Many of the bacteria belonged to the Alphaproteobacteria, Flavobacteria and Gammaproteobacteria, and were hydrocarbonoclastic. Optimal biostimulation practices for continuous culture bioremediation of seawater via mat bioaugmentation were adding the highest possible oil concentration as one lot in the beginning of bioremediation, addition of vitamins, and slowing down the seawater flow rate.

  10. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
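
    The Weibull description in this record can be illustrated with a short sketch. The survival (reliability) form R(x) = exp(-(x/η)^β) and its log-log linearization, ln(-ln R) = β ln x - β ln η, are standard; the function names, the sample points, and the least-squares fit below are assumptions chosen for illustration and are not the fitting procedure used in the study.

```python
import math


def weibull_survival(x, beta, eta):
    """Weibull survival (reliability) function R(x) = exp(-(x / eta) ** beta)."""
    return math.exp(-((x / eta) ** beta))


def fit_weibull(xs, survival):
    """Estimate (beta, eta) by least squares on the linearized form.

    ln(-ln R) = beta * ln(x) - beta * ln(eta), so the slope is beta and
    the intercept is -beta * ln(eta). Requires x > 0 and 0 < R < 1.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(s)) for s in survival]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    slope = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / sum(
        (a - mean_u) ** 2 for a in u
    )
    intercept = mean_v - slope * mean_u
    beta = slope
    eta = math.exp(-intercept / beta)
    return beta, eta


# Hypothetical, noise-free sample points generated from known parameters.
xs = [1.0, 2.0, 5.0, 10.0, 20.0]
observed = [weibull_survival(x, beta=1.5, eta=10.0) for x in xs]
print(fit_weibull(xs, observed))
```

    With noise-free input the fit recovers the generating parameters up to floating-point error; with real fold-occurrence counts the estimate of β would depend on how the empirical distribution is binned and normalized.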

  11. T4SP Database 2.0: An Improved Database for Type IV Secretion Systems in Bacterial Genomes with New Online Analysis Tools

    Science.gov (United States)

    Han, Na; Yu, Weiwen; Qiang, Yujun

    2016-01-01

    Type IV secretion systems (T4SS) mediate the passage of macromolecules across cellular membranes and are essential for virulence and for the exchange of genetic material among bacterial species. The Type IV Secretion Project 2.0 (T4SP 2.0) database is an improved and extended version of the platform released in 2013, aimed at assisting with the detection of T4SS in bacterial genomes. This version provides users with web server tools for detecting the existence and variations of T4SS genes online. The new genome browser interface provides user-friendly access to the most complete and accurate resource of T4SS gene information (e.g., gene number, name, type, position, sequence, related articles, and quick links to other websites). Currently, this online database includes T4SS information for 5239 bacterial strains. Conclusions: T4SS is one of the most versatile secretion systems, necessary for the virulence and survival of bacteria and for the secretion of protein and/or DNA substrates from a donor to a recipient cell. This database on virB/D genes of the T4SS will help scientists worldwide to improve their knowledge of secretion systems and to identify potential pathogenic mechanisms of various microbial species.

  12. Genome-Scale Mapping of Escherichia coli σ54 Reveals Widespread, Conserved Intragenic Binding.

    Directory of Open Access Journals (Sweden)

    Richard P Bonocora

    2015-10-01

    Full Text Available Bacterial RNA polymerases must associate with a σ factor to bind promoter DNA and initiate transcription. There are two families of σ factor: the σ70 family and the σ54 family. Members of the σ54 family are distinct in their ability to bind promoter DNA sequences, in the context of RNA polymerase holoenzyme, in a transcriptionally inactive state. Here, we map the genome-wide association of Escherichia coli σ54, the archetypal member of the σ54 family. Thus, we vastly expand the list of known σ54 binding sites to 135. Moreover, we estimate that there are more than 250 σ54 sites in total. Strikingly, the majority of σ54 binding sites are located inside genes. The location and orientation of intragenic σ54 binding sites is non-random, and many intragenic σ54 binding sites are conserved. We conclude that many intragenic σ54 binding sites are likely to be functional. Consistent with this assertion, we identify three conserved, intragenic σ54 promoters that drive transcription of mRNAs with unusually long 5' UTRs.

  13. Production of polyhydroxyalkanoates (PHA) by bacterial consortium from excess sludge fermentation liquid at laboratory and pilot scales.

    Science.gov (United States)

    Jia, Qianqian; Xiong, Huilei; Wang, Hui; Shi, Hanchang; Sheng, Xinying; Sun, Run; Chen, Guoqiang

    2014-11-01

    The generation of polyhydroxyalkanoates (PHA) from excess sludge fermentation liquid (SFL) was studied at lab and pilot scale. A PHA-accumulating bacterial consortium (S-150) was isolated from activated sludge using simulated SFL (S-SFL) containing high concentrations of volatile fatty acids (VFA) and nitrogen. At lab scale, the maximal PHA content was 59.18% of the dry cell weight (DCW) in S-SFL and dropped to 23.47% of DCW in actual SFL (L-SFL). The pilot-scale integrated system comprised an anaerobic fermentation reactor (AFR), a ceramic membrane system (CMS) and a PHA production bio-reactor (PHAR). The PHA content from pilot-scale SFL (P-SFL) finally reached 59.47% of DCW, with a maximal PHA yield coefficient (YP/S) of 0.17 g PHA/g COD. The results indicated that VFA-containing SFL was suitable for PHA production. The adverse impact of excess nitrogen and non-VFAs in SFL might be eliminated by pilot-scale domestication, which might have resulted in an optimized community structure and improved substrate selectivity in S-150.

  14. Substrate type and free ammonia determine bacterial community structure in full-scale mesophilic anaerobic digesters treating cattle or swine manure

    Directory of Open Access Journals (Sweden)

    Jiabao eLi

    2015-11-01

    Full Text Available The microbial-mediated anaerobic digestion (AD) process represents an efficient biological process for the treatment of organic waste along with biogas harvest. Currently, the key factors structuring bacterial communities and the potential core and unique bacterial populations in manure anaerobic digesters are not completely elucidated. In this study, we collected sludge samples from 20 full-scale anaerobic digesters treating cattle or swine manure, and investigated the variations of bacterial community compositions using high-throughput 16S rRNA amplicon sequencing. Clustering and correlation analysis suggested that substrate type and free ammonia (FA) play key roles in determining the bacterial community structure. The COD:NH4+-N (C:N) ratio of substrate and FA were the most important available operational parameters correlating to the bacterial communities in cattle and swine manure digesters, respectively. The bacterial populations in all of the digesters were dominated by phylum Firmicutes, followed by Bacteroidetes, Proteobacteria and Chloroflexi. Increased FA content selected Firmicutes, suggesting that they probably play more important roles under high FA content. Syntrophic metabolism by Proteobacteria, Chloroflexi, Synergistetes and Planctomycetes is likely inhibited when FA content is high. Despite the different manure substrates, operational conditions and geographical locations of digesters, core bacterial communities were identified. The core communities were best characterized by phylum Firmicutes, wherein Clostridium predominated overwhelmingly. Substrate-unique and abundant communities may reflect the properties of manure substrate and operational conditions. These findings extend our current understanding of the bacterial assembly in full-scale manure anaerobic digesters.

  15. Genome-wide association analysis of bacterial cold water disease resistance in rainbow trout reveals the potential of a hybrid approach between genomic selection and marker assisted selection

    Science.gov (United States)

    Genomic selection (GS) simultaneously incorporates dense SNP marker genotypes with phenotypic data from related animals to predict animal-specific genomic breeding value (GEBV), which circumvents the need to measure the disease phenotype in potential breeders. Marker assisted selection (MAS) involv...

  16. Large scale comparison of innate responses to viral and bacterial pathogens in mouse and macaque.

    Directory of Open Access Journals (Sweden)

    Guy Zinman

    Full Text Available Viral and bacterial infections of the lower respiratory tract are major causes of morbidity and mortality worldwide. Alveolar macrophages line the alveolar spaces and are the first cells of the immune system to respond to invading pathogens. To determine the similarities and differences between the responses of mice and macaques to invading pathogens we profiled alveolar macrophages from these species following infection with two viral (PR8 and Fuj/02 influenza A) and two bacterial (Mycobacterium tuberculosis and Francisella tularensis Schu S4) pathogens. Cells were collected at 6 time points following each infection and expression profiles were compared across and between species. Our analyses identified a core set of genes, activated in both species and across all pathogens, that were predominantly part of the interferon response pathway. In addition, we identified similarities across species in the way innate immune cells respond to lethal versus non-lethal pathogens. On the other hand, we also found several species- and pathogen-specific response patterns. These results provide new insights into mechanisms by which the innate immune system responds to, and interacts with, invading pathogens.

  17. Genome-scale reconstruction and analysis of the metabolic network in the hyperthermophilic archaeon Sulfolobus solfataricus.

    Directory of Open Access Journals (Sweden)

    Thomas Ulas

    Full Text Available We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs with growth occurring at pH 2-4 (optimum 3.5) and a temperature of 75-80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data were also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism.

  18. Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

    Directory of Open Access Journals (Sweden)

    Brooks J Paul

    2010-03-01

    Full Text Available Background: Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results: Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions: By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and established a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum

  19. Genome-scale reconstruction of the metabolic network in Yersinia pestis, strain 91001

    Energy Technology Data Exchange (ETDEWEB)

    Navid, A; Almaas, E

    2009-01-13

    The gram-negative bacterium Yersinia pestis, the aetiological agent of bubonic plague, is one of the deadliest pathogens known to man. Despite its historical reputation, plague is a modern disease which annually afflicts thousands of people. Public safety considerations greatly limit clinical experimentation on this organism and thus development of theoretical tools to analyze the capabilities of this pathogen is of utmost importance. Here, we report the first genome-scale metabolic model of Yersinia pestis biovar Mediaevalis based both on its recently annotated genome, and physiological and biochemical data from literature. Our model demonstrates excellent agreement with the known metabolic needs and capabilities of Y. pestis. Since Y. pestis is a meiotrophic organism, we have developed CryptFind, a systematic approach to identify all candidate cryptic genes responsible for known and theoretical meiotrophic phenomena. In addition to uncovering every known cryptic gene for Y. pestis, our analysis of the rhamnose fermentation pathway suggests that betB is the responsible cryptic gene. Despite all of our medical advances, we still do not have a vaccine for bubonic plague. Recent discoveries of antibiotic resistant strains of Yersinia pestis coupled with the threat of plague being used as a bioterrorism weapon compel us to develop new tools for studying the physiology of this deadly pathogen. Using our theoretical model, we can study the cell's phenotypic behavior under different circumstances and identify metabolic weaknesses which may be harnessed for the development of therapeutics. Additionally, the automatic identification of cryptic genes expands the usage of genomic data for pharmaceutical purposes.

  20. Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    Directory of Open Access Journals (Sweden)

    João Gonçalo Rocha Cardoso

    2015-02-01

    Full Text Available Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or it can be artificially created by mutagenesis and selection or adaptive laboratory evolution. On the other hand, unintended genetic variation during a long term production process may lead to significant economic losses and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function, and discuss approaches for interfacing existing bioinformatics approaches with genome-scale models of cellular processes in order to predict effects of sequence variation on cellular phenotypes.

  1. Bacterial genomes lacking long-range correlations may not be modeled by low-order Markov chains: the role of mixing statistics and frame shift of neighboring genes.

    Science.gov (United States)

    Cocho, Germinal; Miramontes, Pedro; Mansilla, Ricardo; Li, Wentian

    2014-12-01

    We examine the relationship between exponential correlation functions and Markov models in a bacterial genome in detail. Despite the well known fact that Markov models generate sequences with correlation function that decays exponentially, simply constructed Markov models based on nearest-neighbor dimer (first-order), trimer (second-order), up to hexamer (fifth-order), and treating the DNA sequence as being homogeneous all fail to predict the value of exponential decay rate. Even reading-frame-specific Markov models (both first- and fifth-order) could not explain the fact that the exponential decay is very slow. Starting with the in-phase coding-DNA-sequence (CDS), we investigated correlation within a fixed-codon-position subsequence, and in artificially constructed sequences by packing CDSs with out-of-phase spacers, as well as altering CDS length distribution by imposing an upper limit. From these targeted analyses, we conclude that the correlation in the bacterial genomic sequence is mainly due to a mixing of heterogeneous statistics at different codon positions, and the decay of correlation is due to the possible out-of-phase between neighboring CDSs. There are also small contributions to the correlation from bases at the same codon position, as well as by non-coding sequences. These show that the seemingly simple exponential correlation functions in bacterial genome hide a complexity in correlation structure which is not suitable for a modeling by Markov chain in a homogeneous sequence. Other results include: use of the (absolute value) second largest eigenvalue to represent the 16 correlation functions and the prediction of a 10-11 base periodicity from the hexamer frequencies.
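
    The link between Markov chains and exponentially decaying correlations mentioned in this record can be checked on a toy case. The sketch below uses a two-state chain rather than the four-letter DNA alphabet and higher-order models studied in the paper; the transition probabilities and function names are assumptions for illustration. For a two-state chain with switching probabilities a and b, the second eigenvalue of the transition matrix is 1 - a - b, and the lag-k autocorrelation of the state indicator equals that eigenvalue raised to the power k.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(p, k):
    """Return the k-th power of a square matrix (k >= 0) by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, p)
    return result


def indicator_autocorrelation(a, b, k):
    """Lag-k autocorrelation of the indicator of state 0 at stationarity.

    The chain has transition matrix [[1 - a, a], [b, 1 - b]] and
    stationary probability pi0 = b / (a + b) for state 0.
    """
    p = [[1.0 - a, a], [b, 1.0 - b]]
    pi0 = b / (a + b)
    pk = mat_pow(p, k)
    covariance = pi0 * (pk[0][0] - pi0)
    variance = pi0 * (1.0 - pi0)
    return covariance / variance


# Hypothetical switching probabilities; the second eigenvalue is 1 - a - b.
a, b = 0.1, 0.3
second_eigenvalue = 1.0 - a - b
for lag in range(6):
    print(lag, indicator_autocorrelation(a, b, lag), second_eigenvalue ** lag)
```

    The two printed columns agree to floating-point precision, which is the homogeneous-chain behaviour that, according to the abstract, fails to reproduce the slow decay observed in real bacterial genomes.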

  2. Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform

    CERN Document Server

    Cox, Anthony J; Jakobi, Tobias; Rosone, Giovanna

    2012-01-01

    Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the length of the reads and the level of sampling of the underlying genome and compare choices of second-stage compression algorithm. We demonstrate that compression may be greatly improved by a particular reordering of the sequences in the collection and give a novel `implicit sorting' strategy that enables these benefits to be re...
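
    For readers unfamiliar with the transform named in this record, a minimal sketch follows. It computes the BWT of a single short string by sorting all rotations, which is the textbook definition and far too slow for genome-scale collections; it is not the algorithm the authors describe. The function names, the `$` sentinel, and the run count used as a rough proxy for compressibility are assumptions for illustration.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (naive, quadratic memory)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last, sentinel="$"):
    """Invert the transform by rebuilding the sorted rotation table column by column."""
    n = len(last)
    table = [""] * n
    for _ in range(n):
        table = sorted(last[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def count_runs(s):
    """Number of maximal runs of identical characters (fewer runs suit run-length coding)."""
    if not s:
        return 0
    return 1 + sum(1 for x, y in zip(s, s[1:]) if x != y)


transformed = bwt("banana")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana
print(count_runs(transformed))   # 5
```

    The transform tends to group identical characters together, which is why a second-stage compressor applied to the BWT output can do better than one applied to the raw sequence.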

  3. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    Science.gov (United States)

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2016-06-08

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named LTM (logical transformation of model) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named as gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.
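
    The gene-reaction relationships that this record describes as complex are usually written as Boolean rules: isoenzymes are joined by OR, and subunits of a complex by AND. The sketch below evaluates such a rule under a set of gene knockouts. It only illustrates why a reaction knockout does not map one-to-one onto gene manipulations; it is not the LTM or gModel transformation, and the nested-tuple encoding and the gene names are hypothetical.

```python
def reaction_active(rule, knocked_out):
    """Evaluate a gene-reaction rule under a set of knocked-out genes.

    A rule is either a gene name (str) or a tuple ("and" | "or", sub-rule, ...).
    """
    if isinstance(rule, str):
        return rule not in knocked_out
    operator, *operands = rule
    results = [reaction_active(operand, knocked_out) for operand in operands]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical reaction catalysed either by the complex geneA+geneB or by the isoenzyme geneC.
rule = ("or", ("and", "geneA", "geneB"), "geneC")

print(reaction_active(rule, set()))               # True
print(reaction_active(rule, {"geneA"}))           # True, geneC still carries the reaction
print(reaction_active(rule, {"geneA", "geneC"}))  # False
```

    In this toy case, removing the reaction requires knocking out geneC together with at least one subunit of the complex, so a predicted single-reaction knockout already corresponds to a double gene deletion.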

  4. A Consensus Genome-scale Reconstruction of Chinese Hamster Ovary Cell Metabolism

    KAUST Repository

    Hefzi, Hooman

    2016-11-23

    Chinese hamster ovary (CHO) cells dominate biotherapeutic protein production and are widely used in mammalian cell line engineering research. To elucidate metabolic bottlenecks in protein production and to guide cell engineering and bioprocess optimization, we reconstructed the metabolic pathways in CHO and associated them with >1,700 genes in the Cricetulus griseus genome. The genome-scale metabolic model based on this reconstruction, iCHO1766, and cell-line-specific models for CHO-K1, CHO-S, and CHO-DG44 cells provide the biochemical basis of growth and recombinant protein production. The models accurately predict growth phenotypes and known auxotrophies in CHO cells. With the models, we quantify the protein synthesis capacity of CHO cells and demonstrate that common bioprocess treatments, such as histone deacetylase inhibitors, inefficiently increase product yield. However, our simulations show that the metabolic resources in CHO are more than three times more efficiently utilized for growth or recombinant protein synthesis following targeted efforts to engineer the CHO secretory pathway. This model will further accelerate CHO cell engineering and help optimize bioprocesses.

  5. Genome-scale reconstruction of metabolic network for a halophilic extremophile, Chromohalobacter salexigens DSM 3043

    Directory of Open Access Journals (Sweden)

    Oner Ebru

    2011-01-01

    Background: Chromohalobacter salexigens (formerly Halomonas elongata DSM 3043) is a halophilic extremophile with a very broad salinity range and is used as a model organism to elucidate prokaryotic osmoadaptation due to its strong euryhaline phenotype. Results: C. salexigens DSM 3043's metabolism was reconstructed based on genomic, biochemical and physiological information via a non-automated but iterative process. This manually-curated reconstruction accounts for 584 genes, 1386 reactions, and 1411 metabolites. By using flux balance analysis, the model was extensively validated against literature data on the C. salexigens phenotypic features, the transport and use of different substrates for growth as well as against experimental observations on the uptake and accumulation of industrially important organic osmolytes, ectoine, betaine, and its precursor choline, which play important roles in the adaptive response to osmotic stress. Conclusions: This work presents the first comprehensive genome-scale metabolic model of a halophilic bacterium. Being a useful guide for identification and filling of knowledge gaps, the reconstructed metabolic network iOA584 will accelerate the research on halophilic bacteria towards application of systems biology approaches and design of metabolic engineering strategies.

  6. Inferring the choreography of parental genomes during fertilization from ultralarge-scale whole-transcriptome analysis.

    Science.gov (United States)

    Park, Sung-Joon; Komata, Makiko; Inoue, Fukashi; Yamada, Kaori; Nakai, Kenta; Ohsugi, Miho; Shirahige, Katsuhiko

    2013-12-15

    Fertilization precisely choreographs parental genomes by using gamete-derived cellular factors and activating genome regulatory programs. However, the mechanism remains elusive owing to the technical difficulties of preparing large numbers of high-quality preimplantation cells. Here, we collected >14 × 10^4 high-quality mouse metaphase II oocytes and used these to establish detailed transcriptional profiles for four early embryo stages and parthenogenetic development. By combining these profiles with other public resources, we found evidence that gene silencing appeared to be mediated in part by noncoding RNAs and that this was a prerequisite for post-fertilization development. Notably, we identified 817 genes that were differentially expressed in embryos after fertilization compared with parthenotes. The regulation of these genes was distinctly different from those expressed in parthenotes, suggesting functional specialization of particular transcription factors prior to first cell cleavage. We identified five transcription factors that were potentially necessary for developmental progression: Foxd1, Nkx2-5, Sox18, Myod1, and Runx1. Our very large-scale whole-transcriptome profile of early mouse embryos yielded a novel and valuable resource for studies in developmental biology and stem cell research. The database is available at http://dbtmee.hgc.jp.

  7. Deriving metabolic engineering strategies from genome-scale modeling with flux ratio constraints.

    Science.gov (United States)

    Yen, Jiun Y; Nazem-Bokaee, Hadi; Freedman, Benjamin G; Athamneh, Ahmad I M; Senger, Ryan S

    2013-05-01

    Optimized production of bio-based fuels and chemicals from microbial cell factories is a central goal of systems metabolic engineering. To achieve this goal, a new computational method of using flux balance analysis with flux ratios (FBrAtio) was further developed in this research and applied to five case studies to evaluate and design metabolic engineering strategies. The approach was implemented using publicly available genome-scale metabolic flux models. Synthetic pathways were added to these models along with flux ratio constraints by FBrAtio to achieve increased (i) cellulose production from Arabidopsis thaliana; (ii) isobutanol production from Saccharomyces cerevisiae; (iii) acetone production from Synechocystis sp. PCC6803; (iv) H2 production from Escherichia coli MG1655; and (v) isopropanol, butanol, and ethanol (IBE) production from engineered Clostridium acetobutylicum. The FBrAtio approach was applied to each case to simulate a metabolic engineering strategy already implemented experimentally, and flux ratios were continually adjusted to find (i) the end-limit of increased production using the existing strategy, (ii) new potential strategies to increase production, and (iii) the impact of these metabolic engineering strategies on product yield and culture growth. The FBrAtio approach has the potential to design "fine-tuned" metabolic engineering strategies in silico that can be implemented directly with available genomic tools.
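
    As a rough illustration of why a flux ratio can sit alongside ordinary flux balance constraints, the sketch below shows that a ratio at a branch point can be rewritten as a linear equality. It is a minimal toy, not the FBrAtio implementation: the three-flux network, the function names, and the numbers are all assumptions made for this example.

```python
# Toy branch point: metabolite M is produced by v_in and consumed by v_a and v_b.
# Network, names, and numbers are illustrative assumptions, not taken from the paper.

def ratio_constraint_row(r):
    """Coefficients over [v_in, v_a, v_b] for the linear form of v_a / (v_a + v_b) = r.

    Multiplying out gives (1 - r) * v_a - r * v_b = 0, which is linear in the fluxes.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("flux ratio must lie in [0, 1]")
    return [0.0, 1.0 - r, -r]


def split_fluxes(v_in, r):
    """Solve the mass balance (v_in = v_a + v_b) together with the ratio constraint."""
    v_a = r * v_in
    v_b = (1.0 - r) * v_in
    return [v_in, v_a, v_b]


def dot(row, fluxes):
    """Evaluate one linear constraint row against a flux vector."""
    return sum(c * v for c, v in zip(row, fluxes))


mass_balance_row = [1.0, -1.0, -1.0]  # steady state for M
fluxes = split_fluxes(v_in=10.0, r=0.3)

print("fluxes [v_in, v_a, v_b]:", fluxes)
print("mass balance residual:", dot(mass_balance_row, fluxes))
print("ratio constraint residual:", dot(ratio_constraint_row(0.3), fluxes))
```

    Because the extra row is linear, it could in principle be appended to a stoichiometric constraint matrix and passed to any linear programming solver; the toy above solves the two equations by hand only to stay dependency-free.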

  8. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    Science.gov (United States)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.; Thiele, Ines; Palsson, Bernhard O.; Saunders, Michael A.

    2017-01-01

    Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We have developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.
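
    The concern about double precision on data spanning many orders of magnitude can be seen in a very small example. The sketch below is not DQQ or MINOS; it only contrasts ordinary double-precision addition with exact rational arithmetic from the Python standard library, using made-up values.

```python
from fractions import Fraction

# Illustrative coefficients spanning many orders of magnitude (assumed values).
values = [1e16, 1.0, -1e16]

# Double precision: the 1.0 is absorbed when added to 1e16 and never recovered.
double_sum = 0.0
for v in values:
    double_sum += v

# Exact rational arithmetic keeps every digit of the same inputs.
exact_sum = sum(Fraction(v) for v in values)

print("double precision:", double_sum)
print("exact rational:  ", exact_sum)
```

    Higher-precision floating point, as used in the quadruple-precision solver described above, narrows this kind of gap without the cost of fully exact arithmetic.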

  9. Integration of gene expression data into genome-scale metabolic models

    DEFF Research Database (Denmark)

    Åkesson, M.; Förster, Jochen; Nielsen, Jens

    2004-01-01

    A framework for integration of transcriptome data into stoichiometric metabolic models to obtain improved flux predictions is presented. The key idea is to exploit the regulatory information in the expression data to give additional constraints on the metabolic fluxes in the model. Measurements of gene expression from chemostat and batch cultures of Saccharomyces cerevisiae were combined with a recently developed genome-scale model, and the computed metabolic flux distributions were compared to experimental values from carbon labeling experiments and metabolic network analysis. The integration of expression data resulted in improved predictions of metabolic behavior in batch cultures, enabling quantitative predictions of exchange fluxes as well as qualitative estimations of changes in intracellular fluxes. A critical discussion of correlation between gene expression and metabolic fluxes is given.

  10. Genome scale models of yeast: towards standardized evaluation and consistent omic integration

    DEFF Research Database (Denmark)

    Sanchez, Benjamin J.; Nielsen, Jens

    2015-01-01

    Genome scale models (GEMs) have enabled remarkable advances in systems biology, acting as functional databases of metabolism, and as scaffolds for the contextualization of high-throughput data. In the case of Saccharomyces cerevisiae (budding yeast), several GEMs have been published and are currently used for metabolic engineering and elucidating biological interactions. Here we review the history of yeast's GEMs, focusing on recent developments. We study how these models are typically evaluated, using both descriptive and predictive metrics. Additionally, we analyze the different ways in which all levels of omics data (from gene expression to flux) have been integrated in yeast GEMs. Relevant conclusions and current challenges for both GEM evaluation and omic integration are highlighted.

  11. Reconstruction and modeling protein translocation and compartmentalization in Escherichia coli at the genome-scale

    DEFF Research Database (Denmark)

    Liu, Joanne K.; O’Brien, Edward J.; Lerman, Joshua A.;

    2014-01-01

    Background: Membranes play a crucial role in cellular functions. Membranes provide a physical barrier, control the trafficking of substances entering and leaving the cell, and are a major determinant of cellular ultra-structure. In addition, components embedded within the membrane participate in cell signaling, energy transduction, and other critical cellular functions. All these processes must share the limited space in the membrane; thus it represents a notable constraint on cellular functions. Membrane- and location-based processes have not yet been reconstructed and explicitly integrated ... the functional content of membranes, cellular compartment-specific composition, and that it can be utilized to examine the effect of perturbing an expanded set of network components. iJL1678-ME takes a notable step towards the inclusion of cellular ultra-structure in genome-scale models.

  12. Principles of proteome allocation are revealed using proteomic data and genome-scale models

    DEFF Research Database (Denmark)

    Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.

    2016-01-01

    Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the "generalist" (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions ... of these sectors for the general stress response sigma factor sigma(S). Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally ...

  13. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Science.gov (United States)

    Mader, Kevin; Stampanoni, Marco

    2016-01-01

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  14. Moving image analysis to the cloud: A case study with a genome-scale tomographic study

    Energy Technology Data Exchange (ETDEWEB)

    Mader, Kevin [4Quant Ltd., Switzerland & Institute for Biomedical Engineering at University and ETH Zurich (Switzerland); Stampanoni, Marco [Institute for Biomedical Engineering at University and ETH Zurich, Switzerland & Swiss Light Source at Paul Scherrer Institut, Villigen (Switzerland)

    2016-01-28

    Over the last decade, the time required to measure a terabyte of microscopic imaging data has gone from years to minutes. This shift has moved many of the challenges away from experimental design and measurement to scalable storage, organization, and analysis. As many scientists and scientific institutions lack training and competencies in these areas, major bottlenecks have arisen and led to substantial delays and gaps between measurement, understanding, and dissemination. We present in this paper a framework for analyzing large 3D datasets using cloud-based computational and storage resources. We demonstrate its applicability by showing the setup and costs associated with the analysis of a genome-scale study of bone microstructure. We then evaluate the relative advantages and disadvantages associated with local versus cloud infrastructures.

  15. Genome-scale modeling of the protein secretory machinery in yeast.

    Science.gov (United States)

    Feizi, Amir; Österlund, Tobias; Petranovic, Dina; Bordel, Sergio; Nielsen, Jens

    2013-01-01

    The protein secretory machinery in Eukarya is involved in post-translational modification (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well-studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM: characterized by seven PTMs features). An algorithm was developed which mimics secretory machinery and assigns each secretory protein to a particular secretory class that determines the set of PTMs and transport steps specific to each protein. Protein abundances were integrated with the model in order to gain system level estimation of the metabolic demands associated with the processing of each specific protein as well as a quantitative estimation of the activity of each component of the secretory machinery.

  16. Genome-scale consequences of cofactor balancing in engineered pentose utilization pathways in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Amit Ghosh

    Biofuels derived from lignocellulosic biomass offer promising alternative renewable energy sources for transportation fuels. Significant effort has been made to engineer Saccharomyces cerevisiae to efficiently ferment pentose sugars such as D-xylose and L-arabinose into biofuels such as ethanol through heterologous expression of the fungal D-xylose and L-arabinose pathways. However, one of the major bottlenecks in these fungal pathways is that the cofactors are not balanced, which contributes to inefficient utilization of pentose sugars. We utilized a genome-scale model of S. cerevisiae to predict the maximal achievable growth rate for cofactor balanced and imbalanced D-xylose and L-arabinose utilization pathways. Dynamic flux balance analysis (DFBA) was used to simulate batch fermentation of glucose, D-xylose, and L-arabinose. The dynamic models and experimental results are in good agreement for the wild type and for the engineered D-xylose utilization pathway. Cofactor balancing the engineered D-xylose and L-arabinose utilization pathways simulated an increase in ethanol batch production of 24.7% while simultaneously reducing the predicted substrate utilization time by 70%. Furthermore, the effects of cofactor balancing the engineered pentose utilization pathways were evaluated throughout the genome-scale metabolic network. This work not only provides new insights to the global network effects of cofactor balancing but also provides useful guidelines for engineering a recombinant yeast strain with cofactor balanced engineered pathways that efficiently co-utilizes pentose and hexose sugars for biofuels production. Experimental switching of cofactor usage in enzymes has been demonstrated, but is a time-consuming effort. Therefore, systems biology models that can predict the likely outcome of such strain engineering efforts are highly useful for motivating which efforts are likely to be worth the significant time investment.

  17. Zea mays iRS1563: a comprehensive genome-scale metabolic reconstruction of maize metabolism.

    Science.gov (United States)

    Saha, Rajib; Suthers, Patrick F; Maranas, Costas D

    2011-01-01

    The scope and breadth of genome-scale metabolic reconstructions have continued to expand over the last decade. Herein, we introduce a genome-scale model for a plant with direct applications to food and bioenergy production (i.e., maize). Maize annotation is still underway, which introduces significant challenges in the association of metabolic functions to genes. The developed model is designed to meet rigorous standards on gene-protein-reaction (GPR) associations, elementally and charged balanced reactions and a biomass reaction abstracting the relative contribution of all biomass constituents. The metabolic network contains 1,563 genes and 1,825 metabolites involved in 1,985 reactions from primary and secondary maize metabolism. For approximately 42% of the reactions direct literature evidence for the participation of the reaction in maize was found. As many as 445 reactions and 369 metabolites are unique to the maize model compared to the AraGEM model for A. thaliana. 674 metabolites and 893 reactions are present in Zea mays iRS1563 that are not accounted for in maize C4GEM. All reactions are elementally and charged balanced and localized into six different compartments (i.e., cytoplasm, mitochondrion, plastid, peroxisome, vacuole and extracellular). GPR associations are also established based on the functional annotation information and homology prediction accounting for monofunctional, multifunctional and multimeric proteins, isozymes and protein complexes. We describe results from performing flux balance analysis under different physiological conditions, (i.e., photosynthesis, photorespiration and respiration) of a C4 plant and also explore model predictions against experimental observations for two naturally occurring mutants (i.e., bm1 and bm3). The developed model corresponds to the largest and more complete to-date effort at cataloguing metabolism for a plant species.
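
    The gene-protein-reaction (GPR) associations mentioned above are commonly read as Boolean rules, with protein complexes behaving like AND and isozymes like OR. The sketch below evaluates such rules under gene knockouts. It is only an illustration of the idea: the rule encoding, the gene names, and the reaction names are hypothetical and are not taken from iRS1563.

```python
# GPR rules as nested tuples: ("and", ...) for protein complexes,
# ("or", ...) for isozymes, and a plain string for a single gene.
# All gene and reaction names below are hypothetical.

def reaction_active(rule, present_genes):
    """Return True if the GPR rule is satisfied by the set of present genes."""
    if isinstance(rule, str):
        return rule in present_genes
    op, *args = rule
    results = [reaction_active(arg, present_genes) for arg in args]
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown operator: {op}")


ALL_GENES = {"geneA", "geneB", "geneC", "geneD", "geneE"}

gpr = {
    "R_complex": ("and", "geneA", "geneB"),                 # multimeric enzyme
    "R_isozyme": ("or", "geneC", "geneD"),                  # two isozymes
    "R_mixed": ("or", ("and", "geneA", "geneB"), "geneE"),  # complex or alternative enzyme
}


def active_reactions(knocked_out):
    """Reactions that can still carry flux after removing the given genes."""
    present = ALL_GENES - set(knocked_out)
    return {rxn for rxn, rule in gpr.items() if reaction_active(rule, present)}


print(sorted(active_reactions({"geneA"})))
print(sorted(active_reactions({"geneC", "geneD"})))
```

    Nested tuples were chosen here only to avoid parsing rule strings; a real reconstruction would normally store the rules in whatever form its modeling toolbox expects.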

  18. Zea mays iRS1563: a comprehensive genome-scale metabolic reconstruction of maize metabolism.

    Directory of Open Access Journals (Sweden)

    Rajib Saha

    The scope and breadth of genome-scale metabolic reconstructions have continued to expand over the last decade. Herein, we introduce a genome-scale model for a plant with direct applications to food and bioenergy production (i.e., maize). Maize annotation is still underway, which introduces significant challenges in the association of metabolic functions to genes. The developed model is designed to meet rigorous standards on gene-protein-reaction (GPR) associations, elementally and charged balanced reactions and a biomass reaction abstracting the relative contribution of all biomass constituents. The metabolic network contains 1,563 genes and 1,825 metabolites involved in 1,985 reactions from primary and secondary maize metabolism. For approximately 42% of the reactions direct literature evidence for the participation of the reaction in maize was found. As many as 445 reactions and 369 metabolites are unique to the maize model compared to the AraGEM model for A. thaliana. 674 metabolites and 893 reactions are present in Zea mays iRS1563 that are not accounted for in maize C4GEM. All reactions are elementally and charged balanced and localized into six different compartments (i.e., cytoplasm, mitochondrion, plastid, peroxisome, vacuole and extracellular). GPR associations are also established based on the functional annotation information and homology prediction accounting for monofunctional, multifunctional and multimeric proteins, isozymes and protein complexes. We describe results from performing flux balance analysis under different physiological conditions, (i.e., photosynthesis, photorespiration and respiration) of a C4 plant and also explore model predictions against experimental observations for two naturally occurring mutants (i.e., bm1 and bm3). The developed model corresponds to the largest and more complete to-date effort at cataloguing metabolism for a plant species.

  19. A Genome-Scale Model of Shewanella piezotolerans Simulates Mechanisms of Metabolic Diversity and Energy Conservation.

    Science.gov (United States)

    Dufault-Thompson, Keith; Jian, Huahua; Cheng, Ruixue; Li, Jiefu; Wang, Fengping; Zhang, Ying

    2017-01-01

    Shewanella piezotolerans strain WP3 belongs to the group 1 branch of the Shewanella genus and is a piezotolerant and psychrotolerant species isolated from the deep sea. In this study, a genome-scale model was constructed for WP3 using a combination of genome annotation, ortholog mapping, and physiological verification. The metabolic reconstruction contained 806 genes, 653 metabolites, and 922 reactions, including central metabolic functions that represented nonhomologous replacements between the group 1 and group 2 Shewanella species. Metabolic simulations with the WP3 model demonstrated consistency with existing knowledge about the physiology of the organism. A comparison of model simulations with experimental measurements verified the predicted growth profiles under increasing concentrations of carbon sources. The WP3 model was applied to study mechanisms of anaerobic respiration through investigating energy conservation, redox balancing, and the generation of proton motive force. Despite being an obligate respiratory organism, WP3 was predicted to use substrate-level phosphorylation as the primary source of energy conservation under anaerobic conditions, a trait previously identified in other Shewanella species. Further investigation of the ATP synthase activity revealed a positive correlation between the availability of reducing equivalents in the cell and the directionality of the ATP synthase reaction flux. Comparison of the WP3 model with an existing model of a group 2 species, Shewanella oneidensis MR-1, revealed that the WP3 model demonstrated greater flexibility in ATP production under the anaerobic conditions. Such flexibility could be advantageous to WP3 for its adaptation to fluctuating availability of organic carbon sources in the deep sea. 
    IMPORTANCE The well-studied nature of the metabolic diversity of Shewanella bacteria makes species from this genus a promising platform for investigating the evolution of carbon metabolism and energy conservation

  20. Direct Mutagenesis of Thousands of Genomic Targets using Microarray-derived Oligonucleotides

    DEFF Research Database (Denmark)

    Bonde, Mads; Kosuri, Sriram; Genee, Hans Jasper;

    2015-01-01

    Multiplex Automated Genome Engineering (MAGE) allows simultaneous mutagenesis of multiple target sites in bacterial genomes using short oligonucleotides. However, large-scale mutagenesis requires hundreds to thousands of unique oligos, which are costly to synthesize and impossible to scale-up by ... insertions per cell. MO-MAGE enables cost-effective large-scale targeted genome engineering that should be useful for a variety of applications in synthetic biology and metabolic engineering.

  1. Contig-Layout-Authenticator (CLA): A Combinatorial Approach to Ordering and Scaffolding of Bacterial Contigs for Comparative Genomics and Molecular Epidemiology.

    Science.gov (United States)

    Shaik, Sabiha; Kumar, Narender; Lankapalli, Aditya K; Tiwari, Sumeet K; Baddam, Ramani; Ahmed, Niyaz

    2016-01-01

    A wide variety of genome sequencing platforms have emerged in the recent past. High-throughput platforms like Illumina and 454 are essentially adaptations of the shotgun approach generating millions of fragmented single or paired sequencing reads. To reconstruct whole genomes, the reads have to be assembled into contigs, which often require further downstream processing. The contigs can be directly ordered according to a reference, scaffolded based on paired read information, or assembled using a combination of the two approaches. While the reference-based approach appears to mask strain-specific information, scaffolding based on paired-end information suffers when repetitive elements longer than the size of the sequencing reads are present in the genome. Sequencing technologies that produce long reads can solve the problems associated with repetitive elements but are not necessarily easily available to researchers. The most common high-throughput technology currently used is the Illumina short read platform. To improve upon the shortcomings associated with the construction of draft genomes with Illumina paired-end sequencing, we developed Contig-Layout-Authenticator (CLA). The CLA pipeline can scaffold reference-sorted contigs based on paired reads, resulting in better assembled genomes. Moreover, CLA also hints at probable misassemblies and contaminations, for the users to cross-check before constructing the consensus draft. The CLA pipeline was designed and trained extensively on various bacterial genome datasets for the ordering and scaffolding of large repetitive contigs. The tool has been validated and compared favorably with other widely-used scaffolding and ordering tools using both simulated and real sequence datasets. CLA is a user friendly tool that requires a single command line input to generate ordered scaffolds.

  2. Complete Genome Sequence of Leifsonia xyli subsp. cynodontis Strain DSM46306, a Gram-Positive Bacterial Pathogen of Grasses

    Science.gov (United States)

    Zerillo, Marcelo Marques; Van Sluys, Marie-Anne; Camargo, Luis Eduardo Aranha; Kitajima, João Paulo

    2013-01-01

    We announce the complete genome sequence of Leifsonia xyli subsp. cynodontis, a vascular pathogen of Bermuda grass. The species also comprises Leifsonia xyli subsp. xyli, a sugarcane pathogen. Since these two subspecies have genome sequences available, a comparative analysis will contribute to our understanding of the differences in their biology and host specificity. PMID:24201198

  3. Proteolytic bacterial dominance in a full-scale municipal solid waste anaerobic reactor assessed by 454 pyrosequencing technology.

    Science.gov (United States)

    Cardinali-Rezende, Juliana; Rojas-Ojeda, Patricia; Nascimento, Andréa M A; Sanz, José L

    2016-03-01

    Biomethanization is an effective means of reducing the organic fraction (OF) derived from municipal solid wastes (MSW). The bacterial diversity of a full-scale MSW anaerobic reactor located in Madrid (Spain) was investigated using high-throughput 454 pyrosequencing. Even though proteolytic bacteria prevailed throughout the process, community shifts were observed from the start-up to the steady-state conditions, with biodiversity increasing over time. The Bacteroidetes and the Firmicutes were the majority phyla: 55.1 and 40.2% (start-up) and 18.7 and 78.7% (steady-state) of the total reads. The system's lack of evenness is noteworthy: sequences affiliated to the proteolytic non-saccharolytic Proteiniphylum, Gallicola and Fastidiosipila genera, together with the saccharolytic Saccharofermentans, were predominant in the system, and this predominance appears to correlate with the presence of a high ammonium concentration. The 454 pyrosequencing revealed a great diversity of rare organisms which seemingly do not sustain any metabolic roles in the course of the OF-MSW degradation. However, this scarce and unique microbiota can confer great resilience to the system as a buffer against changing nutritional and environmental conditions, thus opening the door to increased knowledge about the bacterial community dynamics taking place during MSW treatment processes.

  4. Separation of bacterial spores from flowing water in macro-scale cavities by ultrasonic standing waves

    CERN Document Server

    Lipkens, B; Costolo, M; Stevens, A; Rietman, Edward

    2010-01-01

    The separation of micron-sized bacterial spores (Bacillus cereus) from a steady flow of water through the use of ultrasonic standing waves is demonstrated. An ultrasonic resonator with cross-section of 0.0254 m x 0.0254 m has been designed with a flow inlet and outlet for a water stream that ensures laminar flow conditions into and out of the resonator section of the flow tube. A 0.01905-m diameter PZT-4, nominal 2-MHz transducer is used to generate ultrasonic standing waves in the resonator. The acoustic resonator is 0.0356 m from transducer face to the opposite reflector wall with the acoustic field in a direction orthogonal to the water flow direction. At fixed frequency excitation, spores are concentrated at the stable locations of the acoustic radiation force and trapped in the resonator region. The effect of the transducer voltage and frequency on the efficiency of spore capture in the resonator has been investigated. Successful separation of B. cereus spores from water with typical volume flow rates of...

  5. Absence of large-scale displacement of quinone QB in bacterial photosynthetic reaction centers.

    Science.gov (United States)

    Breton, Jacques

    2004-03-30

    Photosynthesis transforms light into chemical energy by coupling electron transfer to proton uptake at the quinone Q(B). The possibility of initiating this process with a brief pulse of light and the known X-ray structure makes the photosynthetic bacterial reaction center a paradigm for studying coupled electron-proton transfer in biology. It has been established that electron transfer from the primary quinone Q(A) to Q(B) is gated by a protein conformational change. On the basis of a dramatic difference in the location of Q(B) in structures derived from crystals cooled to 90 K either under illumination or in the dark, a functional model for the gating mechanism was proposed whereby neutral Q(B) moves 4.5 Å before receiving the electron from Q(A)(-) [Stowell, M. H. B., McPhillips, T. M., Rees, D. C., Soltis, S. M., Abresch, E., and Feher, G. (1997) Science 276, 812-816]. Isotope-edited FTIR difference spectroscopy of Q(B) photoreduction at 290 and 85 K is used to investigate whether Q(B) moves upon reduction. We show that the specific interactions of the carbonyl groups of Q(B) and Q(B)(-) with the protein at a single binding site remain identical at both temperatures. Therefore, the different locations of Q(B) reported in many X-ray crystal structures probably are unrelated to functional electron transfer from Q(A)(-) to Q(B).

  6. A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory

    Directory of Open Access Journals (Sweden)

    Thiele Ines

    2008-09-01

    Background: Pseudomonas putida is the best studied pollutant degradative bacteria and is harnessed by industrial biotechnology to synthesize fine chemicals. Since the publication of P. putida KT2440's genome, some in silico analyses of its metabolic and biotechnology capacities have been published. However, global understanding of the capabilities of P. putida KT2440 requires the construction of a metabolic model that enables the integration of classical experimental data along with genomic and high-throughput data. The constraint-based reconstruction and analysis (COBRA) approach has been successfully used to build and analyze in silico genome-scale metabolic reconstructions. Results: We present a genome-scale reconstruction of P. putida KT2440's metabolism, iJN746, which was constructed based on genomic, biochemical, and physiological information. This manually-curated reconstruction accounts for 746 genes, 950 reactions, and 911 metabolites. iJN746 captures biotechnologically relevant pathways, including polyhydroxyalkanoate synthesis and catabolic pathways of aromatic compounds (e.g., toluene, benzoate, phenylacetate, nicotinate), not described in other metabolic reconstructions or biochemical databases. The predictive potential of iJN746 was validated using experimental data including growth performance and gene deletion studies. Furthermore, in silico growth on toluene was found to be oxygen-limited, suggesting the existence of oxygen-efficient pathways not yet annotated in P. putida's genome. Moreover, we evaluated the production efficiency of polyhydroxyalkanoates from various carbon sources and found fatty acids as the most prominent candidates, as expected. Conclusion: Here we presented the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-surrounder. Taken together, this work illustrates the utility of iJN746 as (i) a knowledge-base, (ii) a discovery tool, and (iii) an engineering platform to explore P ...

  7. Scaling laws governing stochastic growth and division of single bacterial cells

    CERN Document Server

    Iyer-Biswas, Srividya; Henry, Jonathan T; Lo, Klevin; Burov, Stanislav; Lin, Yihan; Crooks, Gavin E; Crosson, Sean; Dinner, Aaron R; Scherer, Norbert F

    2014-01-01

    Uncovering the quantitative laws that govern the growth and division of single cells remains a major challenge. Using a unique combination of technologies that yields unprecedented statistical precision, we find that the sizes of individual Caulobacter crescentus cells increase exponentially in time. We also establish that they divide upon reaching a critical multiple (≈1.8) of their initial sizes, rather than an absolute size. We show that when the temperature is varied, the growth and division timescales scale proportionally with each other over the physiological temperature range. Strikingly, the cell-size and division-time distributions can both be rescaled by their mean values such that the condition-specific distributions collapse to universal curves. We account for these observations with a minimal stochastic model that is based on an autocatalytic cycle. It predicts the scalings, as well as specific functional forms for the universal curves. Our experimental and theoretical analysis reveals a ...
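
    The link between exponential growth and division at a fixed size multiple can be made concrete with a small deterministic sketch. It assumes a cell grows as s(t) = s0·exp(k·t) and divides on reaching 1.8·s0, so the division time is ln(1.8)/k. The function name and the growth rates are illustrative assumptions, not values from the study, and the stochastic part of the authors' model is ignored here.

```python
import math


def division_time(growth_rate, size_ratio=1.8):
    """Time for a cell growing as s0*exp(growth_rate*t) to reach size_ratio*s0."""
    return math.log(size_ratio) / growth_rate


# Hypothetical growth rates (per minute); not values from the study.
for k in (0.005, 0.010, 0.020):
    growth_timescale = 1.0 / k
    tau = division_time(k)
    # tau / growth_timescale equals ln(1.8) for every k, and the
    # initial size s0 never enters the calculation.
    print(f"k={k:.3f}  1/k={growth_timescale:6.1f}  tau={tau:6.1f}  "
          f"ratio={tau / growth_timescale:.3f}")
```

    Under these assumptions the division time is a fixed multiple of the growth timescale 1/k, which is one simple way to read the reported proportional scaling of the two timescales with temperature.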

  8. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation.

    Science.gov (United States)

    Racle, Julien; Stefaniuk, Adam Jan; Hatzimanikatis, Vassily

    2015-07-28

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription.
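
    To illustrate what a discrete, mechanistic view of translation looks like, the following is a minimal Gillespie-style sketch of ribosomes moving along a single mRNA with site exclusion and a finite ribosome pool. It is not the authors' algorithm: the one-codon ribosome footprint, the single mRNA, and every rate constant and name below are simplifying assumptions chosen for illustration.

```python
import random


def simulate_translation(n_codons=50, total_ribosomes=10, k_init=0.1,
                         k_elong=10.0, t_end=100.0, seed=0):
    """Stochastic simulation of one mRNA; returns (proteins, free, bound)."""
    rng = random.Random(seed)
    occupied = [False] * n_codons   # one ribosome at most per codon
    free = total_ribosomes          # ribosomes in the free pool
    proteins = 0
    t = 0.0
    while True:
        # List every event that is possible in the current state.
        events = []
        if free > 0 and not occupied[0]:
            events.append((k_init * free, "init", 0))
        for site in range(n_codons):
            if not occupied[site]:
                continue
            if site == n_codons - 1:
                events.append((k_elong, "term", site))
            elif not occupied[site + 1]:
                events.append((k_elong, "step", site))
        total_rate = sum(rate for rate, _, _ in events)
        if total_rate == 0:
            break
        # Exponentially distributed waiting time until the next event.
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        # Pick one event with probability proportional to its rate.
        x = rng.random() * total_rate
        for rate, kind, site in events:
            x -= rate
            if x < 0:
                break
        if kind == "init":
            occupied[0] = True
            free -= 1
        elif kind == "step":
            occupied[site] = False
            occupied[site + 1] = True
        else:  # termination releases the ribosome and one protein
            occupied[site] = False
            free += 1
            proteins += 1
    return proteins, free, sum(occupied)


proteins, free, bound = simulate_translation()
print(proteins, free, bound)
```

    Repeating the run with different seeds and comparing the spread of the protein counts gives a rough feel for the noise this kind of model produces; extending it to many competing mRNA species is where the scaling issues described in the abstract arise.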

  9. Noise analysis of genome-scale protein synthesis using a discrete computational model of translation

    Energy Technology Data Exchange (ETDEWEB)

    Racle, Julien; Hatzimanikatis, Vassily, E-mail: vassily.hatzimanikatis@epfl.ch [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland); Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne (Switzerland); Stefaniuk, Adam Jan [Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne (Switzerland)

    2015-07-28

    Noise in genetic networks has been the subject of extensive experimental and computational studies. However, very few of these studies have considered noise properties using mechanistic models that account for the discrete movement of ribosomes and RNA polymerases along their corresponding templates (messenger RNA (mRNA) and DNA). The large size of these systems, which scales with the number of genes, mRNA copies, codons per mRNA, and ribosomes, is responsible for some of the challenges. Additionally, one should be able to describe the dynamics of ribosome exchange between the free ribosome pool and those bound to mRNAs, as well as how mRNA species compete for ribosomes. We developed an efficient algorithm for stochastic simulations that addresses these issues and used it to study the contribution and trade-offs of noise to translation properties (rates, time delays, and rate-limiting steps). The algorithm scales linearly with the number of mRNA copies, which allowed us to study the importance of genome-scale competition between mRNAs for the same ribosomes. We determined that noise is minimized under conditions maximizing the specific synthesis rate. Moreover, sensitivity analysis of the stochastic system revealed the importance of the elongation rate in the resultant noise, whereas the translation initiation rate constant was more closely related to the average protein synthesis rate. We observed significant differences between our results and the noise properties of the most commonly used translation models. Overall, our studies demonstrate that the use of full mechanistic models is essential for the study of noise in translation and transcription.

  10. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Crooijmans, R.P.M.A.; Veenendaal, A.; Dibbits, B.W.; Chin-A-Woeng, T.F.C.; Dunnen, den J.T.; Groenen, M.A.M.

    2009-01-01

    Background - The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set-up an analysis pipeline based on a

  11. A genome-wide association study of Cloninger's temperament scales: Implications for the evolutionary genetics of personality

    NARCIS (Netherlands)

    Verweij, K.J.H.; Zietsch, B.P.; Medland, S.E.; Gordon, S.D.; Benyamin, B.; Nyholt, D.R.; McEvoy, B.P.; Sullivan, P.F.; Heath, A.C.; Madden, P.A.F.; Henders, A.K.; Montgomery, G.W.; Martin, N.G.; Wray, N.R.

    2010-01-01

    Variation in personality traits is 30-60% attributed to genetic influences. Attempts to unravel these genetic influences at the molecular level have, so far, been inconclusive. We performed the first genome-wide association study of Cloninger's temperament scales in a sample of 5117 individuals, in

  12. Bacterial Chemotaxis Toward A NAPL Source Within A Pore-Scale Model Subject to A Range of Groundwater Flow Velocities

    Science.gov (United States)

    Wang, X.; Ford, R. M.

    2010-12-01

    Organic solvents such as toluene are the most widely distributed pollutants in groundwater. Biodegradation of these industrial pollutants requires that microorganisms in the aqueous phase are brought in contact with sources of contamination, which may be dispersed as pore-size organic-phase droplets within the saturated soil matrix. Chemotaxis toward chemical pollutants provides a mechanism for bacteria to migrate to locations of high contamination, which may not normally be accessible to bacteria carried along by groundwater flow, and thus it may improve the efficiency of bioremediation. A microfluidic device was designed to mimic the dissolution of an organic-phase contaminant from a single pore into a larger macropore representing a preferred pathway for microorganisms that are carried along by groundwater flow. The glass windows of the µ-chip allowed image analysis of bacterial distributions within the vicinity of the organic contaminant. Concentrations of chemotactic bacteria P. putida F1 near the organic/aqueous interface were 25% greater than those of a nonchemotactic mutant in the vicinity of toluene for a fluid velocity of 0.5 m/d. For E. coli responding to phenol, the bacterial concentrations were 60% greater than the controls, also at a velocity of 0.5 m/d. Velocities in the macropore were varied over a range that is typical of groundwater velocities from 0.5 to 10 m/d. The accumulation of chemotactic bacteria near the NAPL (nonaqueous phase liquid) chemoattractant source decreased as the fluid velocity increased. At the higher velocities, accumulation of chemotactic bacteria was comparable to the non-chemotactic control experiments. Computer-based simulation using finite element analysis software (COMSOL) was also performed to understand the effects of various model parameters on bacterial chemotaxis to NAPL. There was good agreement between the simulations (generated using reasonable values of the model parameters) and the experimental data for P

  13. SHARP: genome-scale identification of gene-protein-reaction associations in cyanobacteria.

    Science.gov (United States)

    Krishnakumar, S; Durai, Dilip A; Wangikar, Pramod P; Viswanathan, Ganesh A

    2013-11-01

    A genome-scale metabolic model provides an overview of an organism's metabolic capability. These genome-specific metabolic reconstructions are based on identification of gene-protein-reaction (GPR) associations and, in turn, on homology with annotated genes from other organisms. Cyanobacteria are photosynthetic prokaryotes which have diverged appreciably from their nonphotosynthetic counterparts. They also show significant evolutionary divergence from plants, which are well studied for their photosynthetic apparatus. We argue that context-specific sequence and domain similarity can add to the repertoire of the GPR associations and significantly expand our view of the metabolic capability of cyanobacteria. We took an approach that combines the results of context-specific sequence-to-sequence similarity search with those of sequence-to-profile searches. We employ PSI-BLAST for the former, and CDD, Pfam, and COG for the latter. An optimization algorithm was devised to arrive at a weighting scheme to combine the different lines of evidence with KEGG-annotated GPRs as training data. We present the algorithm in the form of software "Systematic, Homology-based Automated Re-annotation for Prokaryotes (SHARP)." We predicted 3,781 new GPR associations for the 10 prokaryotes considered, eight of which are cyanobacterial species. These new GPR associations fall in several metabolic pathways and were used to annotate 7,718 gaps in the metabolic network. These new annotations led to the discovery of several pathways that may be active, thereby providing new directions for metabolic engineering of these species for production of useful products. A metabolic model developed on such a reconstructed network is likely to give better phenotypic predictions.

  14. Determining the control circuitry of redox metabolism at the genome-scale.

    Directory of Open Access Journals (Sweden)

    Stephen Federowicz

    2014-04-01

    Determining how facultative anaerobic organisms sense and direct cellular responses to electron acceptor availability has been a subject of intense study. However, even in the model organism Escherichia coli, established mechanisms only explain a small fraction of the hundreds of genes that are regulated during electron acceptor shifts. Here we propose a qualitative model that accounts for the full breadth of regulated genes by detailing how two global transcription factors (TFs), ArcA and Fnr of E. coli, sense key metabolic redox ratios and act on a genome-wide basis to regulate anabolic, catabolic, and energy generation pathways. We first fill gaps in our knowledge of this transcriptional regulatory network by carrying out ChIP-chip and gene expression experiments to identify 463 regulatory events. We then interfaced this reconstructed regulatory network with a highly curated genome-scale metabolic model to show that ArcA and Fnr regulate >80% of total metabolic flux and 96% of differential gene expression across fermentative and nitrate respiratory conditions. Based on the data, we propose a feedforward with feedback trim regulatory scheme, given the extensive repression of catabolic genes by ArcA and extensive activation of chemiosmotic genes by Fnr. We further corroborated this regulatory scheme by showing a correlation of r² = 0.71 (p < 1e-6) between changes in metabolic flux and changes in regulatory activity across fermentative and nitrate respiratory conditions. Finally, we are able to relate the proposed model to a wealth of previously generated data by contextualizing the existing transcriptional regulatory network.

  15. The population genomics of begomoviruses: global scale population structure and gene flow

    Directory of Open Access Journals (Sweden)

    Prasanna HC

    2010-09-01

    Background: The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous-plant-infecting genus Begomovirus, in the family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results: We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty-four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions: Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could

  16. Large scale genomic testing within herd does not affect contribution margin

    DEFF Research Database (Denmark)

    Hjortø, Line; Ettema, Jehan; Sørensen, Anders Christian

    2013-01-01

    A Danish study from 2012 shows that genomic testing of all or part of the females in a Holstein herd gives roughly the same economic result as not performing any genomic testing at all.

  17. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey

    Directory of Open Access Journals (Sweden)

    den Dunnen Johan T

    2009-10-01

    Background: The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. Results: A total of 100 million 36 bp reads were generated, representing approximately 5-6% (~62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. Conclusion: We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even
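
    The sequencing figures quoted in the Results can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the abstract (100 million reads, 36 bp per read, a ~62 Mbp sampled fraction said to be 5-6% of the genome); the variable names are illustrative, and the implied genome size is a back-calculation rather than a figure reported in the record.

```python
reads = 100_000_000        # "100 million ... reads"
read_length_bp = 36        # "36 bp reads"
sampled_bp = 62_000_000    # "~62 Mbp" genome fraction

# Total sequenced bases divided by the size of the sampled fraction
# gives the average depth over that fraction.
total_bases = reads * read_length_bp
depth = total_bases / sampled_bp

# If 62 Mbp is 5-6% of the genome, the implied genome size range is:
genome_low_bp = sampled_bp / 0.06
genome_high_bp = sampled_bp / 0.05

print(f"total bases: {total_bases / 1e9:.1f} Gbp")
print(f"average depth: {depth:.1f}x")
print(f"implied genome size: {genome_low_bp / 1e9:.2f}-{genome_high_bp / 1e9:.2f} Gbp")
```

    The computed depth of roughly 58x agrees with the "estimated sequence depth of 58" given in the abstract.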

  18. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong

    2011-12-21

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  19. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments.

    Directory of Open Access Journals (Sweden)

    Yong Wang

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed.

  20. Insights into the history of a bacterial group II intron remnant from the genomes of the nitrogen-fixing symbionts Sinorhizobium meliloti and Sinorhizobium medicae.

    Science.gov (United States)

    Toro, N; Martínez-Rodríguez, L; Martínez-Abarca, F

    2014-10-01

    Group II introns are self-splicing catalytic RNAs that act as mobile retroelements. In bacteria, they are thought to be tolerated to some extent because they self-splice and home preferentially to sites outside of functional genes, generally within intergenic regions or in other mobile genetic elements, by mechanisms including the divergence of DNA target specificity to prevent target site saturation. RmInt1 is a mobile group II intron that is widespread in natural populations of Sinorhizobium meliloti and was first described in the GR4 strain. Like other bacterial group II introns, RmInt1 tends to evolve toward an inactive form by fragmentation, with loss of the 3' terminus. We identified genomic evidence of a fragmented intron closely related to RmInt1 buried in the genome of the extant S. meliloti/S. medicae species. By studying this intron, we obtained evidence for the occurrence of intron insertion before the divergence of ancient rhizobial species. This fragmented group II intron has thus existed for a long time and has provided sequence variation, on which selection can act, contributing to diverse genetic rearrangements and to the generation of pan-genome divergence after strain differentiation. The data presented here suggest that fragmented group II introns within intergenic regions close to functionally important neighboring genes may have been microevolutionary forces driving adaptive evolution of these rhizobial species.

  1. Two families of rep-like genes that probably originated by interspecies recombination are represented in viral, plasmid, bacterial, and parasitic protozoan genomes.

    Science.gov (United States)

    Gibbs, Mark J; Smeianov, Vladimir V; Steele, James L; Upcroft, Peter; Efimov, Boris A

    2006-06-01

    Two families of genes related to, and including, rolling circle replication initiator protein (Rep) genes were defined by sequence similarity and by evidence of intergene family recombination. The Rep genes of circoviruses were the best characterized members of the "RecRep1 family." Other members of the RecRep1 family were Rep-like genes found in the genomes of the Canarypox virus, Entamoeba histolytica, and Giardia duodenalis and in a plasmid, p4M, from the Gram-positive bacterium, Bifidobacterium pseudocatenulatum. The "RecRep2 family" comprised some previously identified Rep-like genes from plasmids of phytoplasmas and similar Rep-like genes from the genomes of Lactobacillus acidophilus, Lactococcus lactis, and Phytoplasma asteris. Both RecRep1 and RecRep2 proteins have a nucleotide-binding domain significantly similar to the helicases (2C proteins) of picorna-like viruses. On the N-terminal side of the nucleotide binding domain, RecRep1 proteins have a domain significantly similar to one found in nanovirus Reps, whereas RecRep2 proteins have a domain significantly similar to one in the Reps of pLS1 plasmids. We speculate that RecRep genes have been transferred from viruses or plasmids to parasitic protozoan and bacterial genomes and that Rep proteins were themselves involved in the original recombination events that generated the ancestral RecRep genes.

  2. BAC CGH-array identified specific small-scale genomic imbalances in diploid DMBA-induced rat mammary tumors

    Directory of Open Access Journals (Sweden)

    Samuelson Emma

    2012-08-01

    Background: Development of breast cancer is a multistage process influenced by hormonal and environmental factors as well as by genetic background. The search for genes underlying this malignancy has recently been highly productive, but the etiology behind this complex disease is still not understood. In studies using animal cancer models, heterogeneity of the genetic background and environmental factors is reduced and thus analysis and identification of genetic aberrations in tumors may become easier. To identify chromosomal regions potentially involved in the initiation and progression of mammary cancer, in the present work we subjected a subset of experimental mammary tumors to cytogenetic and molecular genetic analysis. Methods: Mammary tumors were induced with DMBA (7,12-dimethylbenz[a]anthracene) in female rats from the susceptible SPRD-Cu3 strain and from crosses and backcrosses between this strain and the resistant WKY strain. We first produced a general overview of chromosomal aberrations in the tumors using conventional karyotyping (G-banding) and Comparative Genome Hybridization (CGH) analyses. Particular chromosomal changes were then analyzed in more detail using an in-house developed BAC (bacterial artificial chromosome) CGH-array platform. Results: Tumors appeared to be diploid by conventional karyotyping; however, several sub-microscopic chromosome gains or losses in the tumor material were identified by BAC CGH-array analysis. An oncogenetic tree analysis based on the BAC CGH-array data suggested gain of rat chromosome (RNO) band 12q11, loss of RNO5q32 or RNO6q21 as the earliest events in the development of these mammary tumors. Conclusions: Some of the identified changes appear to be more specific for DMBA-induced mammary tumors and some are similar to those previously reported in the ACI rat model for estradiol-induced mammary tumors. The latter group of changes is more interesting, since they may represent anomalies that involve

  3. Continental-scale variation in seaweed host-associated bacterial communities is a function of host condition, not geography.

    Science.gov (United States)

    Marzinelli, Ezequiel M; Campbell, Alexandra H; Zozaya Valdes, Enrique; Vergés, Adriana; Nielsen, Shaun; Wernberg, Thomas; de Bettignies, Thibaut; Bennett, Scott; Caporaso, J Gregory; Thomas, Torsten; Steinberg, Peter D

    2015-10-01

    Interactions between hosts and associated microbial communities can fundamentally shape the development and ecology of 'holobionts', from humans to marine habitat-forming organisms such as seaweeds. In marine systems, planktonic microbial community structure is mainly driven by geography and related environmental factors, but the large-scale drivers of host-associated microbial communities are largely unknown. Using 16S-rRNA gene sequencing, we characterized 260 seaweed-associated bacterial and archaeal communities on the kelp Ecklonia radiata from three biogeographical provinces spanning 10° of latitude and 35° of longitude across the Australian continent. These phylogenetically and taxonomically diverse communities were more strongly and consistently associated with host condition than geographical location or environmental variables, and a 'core' microbial community characteristic of healthy kelps appears to be lost when hosts become stressed. Microbial communities on stressed individuals were more similar to each other among locations than those on healthy hosts. In contrast to biogeographical patterns of planktonic marine microbial communities, host traits emerge as critical determinants of associated microbial community structure of these holobionts, even at a continental scale.

  4. Up-scaling aquaculture wastewater treatment by microalgal bacterial flocs: from lab reactors to an outdoor raceway pond.

    Science.gov (United States)

    Van Den Hende, Sofie; Beelen, Veerle; Bore, Gaëlle; Boon, Nico; Vervaeren, Han

    2014-05-01

    Sequencing batch reactors with microalgal bacterial flocs (MaB-floc SBRs) are a novel approach for photosynthetic aerated wastewater treatment based on bioflocculation. To assess their technical potential for aquaculture wastewater treatment in Northwest Europe, MaB-floc SBRs were up-scaled from indoor photobioreactors of 4 L over 40 and 400 L to a 12 m³ outdoor raceway pond. Scale-up decreased the nutrient removal efficiencies by a factor of 1-3 and the volumetric biomass productivities by a factor of 10-13. Effluents met current discharge norms, except for nitrite and nitrate. Flue gas sparging was needed to decrease the effluent pH. Outdoor MaB-flocs showed enhanced settling properties and an increased ash and chlorophyll a content. Bioflocculation enabled successful harvesting by gravity settling and dewatering by filtering at 150-250 μm. Optimisation of nitrogen removal and biomass valorisation are future challenges towards industrial implementation of MaB-floc SBRs for aquaculture wastewater treatment.

  5. Expanding a dynamic flux balance model of yeast fermentation to genome-scale

    Directory of Open Access Journals (Sweden)

    Agosin Eduardo

    2011-05-01

    Background: Yeast is considered to be a workhorse of the biotechnology industry for the production of many value-added chemicals, alcoholic beverages and biofuels. Optimization of the fermentation is a challenging task that greatly benefits from dynamic models able to accurately describe and predict the fermentation profile and resulting products under different genetic and environmental conditions. In this article, we developed and validated a genome-scale dynamic flux balance model, using experimentally determined kinetic constraints. Results: Appropriate equations for maintenance, biomass composition, anaerobic metabolism and nutrient uptake are key to improve model performance, especially for predicting glycerol and ethanol synthesis. Prediction profiles of synthesis and consumption of the main metabolites involved in alcoholic fermentation closely agreed with experimental data obtained from numerous lab and industrial fermentations under different environmental conditions. Finally, fermentation simulations of genetically engineered yeasts closely reproduced previously reported experimental results regarding final concentrations of the main fermentation products such as ethanol and glycerol. Conclusion: A useful tool to describe, understand and predict metabolite production in batch yeast cultures was developed. The resulting model, if used wisely, could help to search for new metabolic engineering strategies to manage ethanol content in batch fermentations.

  6. Population genomics of an endemic Mediterranean fish: differentiation by fine